Introduction

Ambient air pollution is a main contributor to the global burden of disease, including cardiovascular and respiratory diseases1. Although there is extensive literature on the effects of short- and long-term exposure to ambient air pollution on chronic respiratory diseases2, evidence is limited for long-term exposure and incidence or severity of acute respiratory infections3.

COVID-19, caused by infection with the SARS-CoV-2 virus, mainly presents as an acute respiratory infection. Several risk factors have been identified for progression to severe disease and mortality, such as age, male sex, and chronic comorbidities4. It is well known that air pollutants, both particulate matter and gases, can impair lung defenses against infections5. Additionally, there is evidence that air pollutants may upregulate the expression of SARS-CoV-2 receptors in the lung6. Early in the pandemic, ecological studies reported associations between ambient air pollution and increased risk of COVID-19 hospitalization and death7. However, individual-level cohort studies are needed to overcome the multiple methodological limitations of ecological studies on the topic7,8. A number of individual-level studies reported positive associations between long-term exposure to air pollutants and hospital admission or death, particularly for fine particulate matter [PM with an aerodynamic diameter of ≤2.5 µm] (PM2.5) but less consistently for nitrogen dioxide (NO2). These studies followed cohorts of positive COVID-19 cases9,10,11 or selected populations12,13,14,15, and one analyzed the general population16. However, several knowledge gaps remain due to the heterogeneity in observed estimates for COVID-19 severity12,17 and death9,10,12,18, likely because of the limited sample size and number of events in previous studies and the lack of multi-pollutant models.

To address these evidence gaps, we analyzed a large population-based cohort of the general population in Catalonia. We investigated associations between PM2.5, NO2, ozone (O3) and black carbon (BC) and hospital and intensive care unit (ICU) admission, hospital length of stay, and death related to COVID-19 during 2020.

Results

Population and exposure characteristics

Figure 1 shows the course of the COVID-19 pandemic during 2020 in Catalonia, Spain. The study flowchart is shown in Supplementary Fig. S1. From 4,669,011 adult individuals alive and residing in Catalonia on March 1, 2020, we excluded 409 (<0.1%) because of loss to follow-up, 589 (<0.1%) because of inconsistent dates, 1512 (<0.1%) missing residential address and 5999 (0.1%) missing air pollutants exposure values, resulting in 4,660,502 individuals included in our analyses.
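The exclusion counts above can be checked with a short calculation. This is only an illustrative sketch: the counts are taken from the text, while the variable names and dictionary layout are assumptions made for the example.

```python
# Cohort flow, using the counts reported in the text.
# Variable names are illustrative and not part of the study code.
eligible = 4_669_011  # adults alive and residing in Catalonia on March 1, 2020

exclusions = {
    "loss to follow-up": 409,
    "inconsistent dates": 589,
    "missing residential address": 1_512,
    "missing air pollutant exposure": 5_999,
}

# Individuals remaining after all exclusions are applied.
analyzed = eligible - sum(exclusions.values())

for reason, n in exclusions.items():
    # Share of the eligible population removed at each step.
    print(f"{reason}: {n:,} ({100 * n / eligible:.2f}%)")
print(f"included in analyses: {analyzed:,}")
```

The total matches the 4,660,502 individuals reported as included in the analyses.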

Fig. 1: Weekly COVID-19 cases and severe related events during 2020 in Catalonia, Spain.

The vertical dashed black line refers to the limit between the first and second waves (June 21, 2020).

In 2020, there were 340,608 COVID-19 diagnoses, of which 216,752 (64%) were laboratory confirmed. The majority of COVID-19 diagnoses occurred at primary care units (249,878; 73%). Among the 340,608 cases, there were 47,174 (14%) COVID-19-related hospitalizations, 4699 (1.4%) ICU admissions, and 10,001 (3%) COVID-19-related deaths. Among the 10,001 deaths, 3744 (37%) occurred among non-hospitalized individuals. The median hospital length of stay (LOS) was 7 [p25–p75: 4–14] days. The description of the COVAIR-CAT cohort and COVID-19-related events is shown in Table 1.

Table 1 Characteristics of the cohort overall and according to COVID-19 outcomes

Annual averages (SD) of air pollution in the cohort were 13.9 (2.2) µg/m3 for PM2.5, 26.2 (10.3) µg/m3 for NO2, and 91.6 (8.2) µg/m3 for O3 from the COVAIR-CAT 2019 models. The distribution of these pollutants’ concentrations and the exposure estimates from the ELAPSE models and their correlations are shown in Supplementary Methods.

Associations with COVID-19 severe events

In single-pollutant models (Main Model–Model 4, Table 2 and Fig. 2), higher annual average exposure to PM2.5 and NO2 was associated with a greater hazard of COVID-19-related events. For PM2.5, there were positive associations for hospitalization (HR 1.19, 95% CI, 1.16–1.21), ICU admission (HR 1.16, 95% CI, 1.09–1.24), and death (HR 1.13, 95% CI, 1.07–1.19) per IQR increase. For NO2, there were positive associations for hospitalization (HR 1.25, 95% CI, 1.22–1.29), ICU admission (HR 1.42, 95% CI, 1.30–1.55), and death (HR 1.18, 95% CI, 1.10–1.27) per IQR increase. For both PM2.5 and NO2, positive associations were observed for hospital LOS. In two-pollutant models, NO2 remained positively associated with hospital and ICU admission after adjustment for PM2.5. Similarly, positive associations for PM2.5 remained for hospital admission and hospital LOS after adjustment for NO2. For O3, the association was negative for COVID-19-related events in single-pollutant models and null or positive when co-adjusted for NO2: HR 1.10 (95% CI, 1.02–1.18) for ICU admission and 1.01 (95% CI, 0.95–1.07) for death per IQR increase in O3. O3 was also positively associated with hospital LOS in two-pollutant models (Supplementary Table S1, Supplementary Fig. S2). Unadjusted estimates are shown in Supplementary Table S2. Results per one-unit increase in air pollution are shown in Supplementary Table S3.

Table 2 Adjusted associations between long-term air pollutants and COVID-19-related outcomes in single and two-pollutant models
Fig. 2: Sequential adjustment and sensitivity analyses for associations between long-term exposure to NO2 and PM2.5 and COVID-19-related hospitalization (single-pollutant models).

These estimates are from the sequential adjustment for confounding (black estimates, models 1–4) and six a priori sensitivity analyses (blue estimates, models 5–10), as described in “Methods”. Error bars refer to the 95% confidence interval from the Cox Proportional Hazards model. cov denotes covariates; M denotes model; SES denotes socioeconomic status.

All associations were comparable with Model 4 (main model) in sensitivity analyses, except when including cases diagnosed at nursing homes and evaluating COVID-19 deaths (Fig. 2; Supplementary Figs. S2, S3, S4, and S5; Supplementary Tables S4, S5, and S6). When evaluating associations by wave, the estimated measures of effect for hospitalization were of greater magnitude in the first wave than in the second wave (Table 3, Supplementary Table S7). The majority (80.4%) of hospital admissions had COVID-19 mentioned as a cause of hospital admission. The association of long-term exposure to air pollutants with hospitalization was also of slightly greater magnitude when COVID-19-related hospitalization was defined by COVID-19 or respiratory causes, or COVID-19 only, as main causes of admission, compared to all-cause admissions (Table 4, Supplementary Tables S8 and S9).

Table 3 Adjusted long-term associations between air pollutants and COVID-19-related outcomes in single-pollutant models by COVID-19 waves
Table 4 Adjusted long-term associations between air pollutants and COVID-19-related hospitalization in single and two-pollutant models, comparing all-cause with cause-specific hospitalizations

When evaluating the subset with COVID-19 diagnosis, the results for NO2 and PM2.5 were consistent with the main analysis for hospitalizations and ICU admission, although of a smaller magnitude. The associations with death were null in the whole period but positive in the second wave for NO2 and PM2.5 (Supplementary Tables S10, S11, and S12). Overall, there were no associations for O3, except positive associations for death during the first wave. Effect estimates based on the COVAIR-CAT exposure models for 2018 and ELAPSE-2010 were broadly comparable to those in the main analyses (Supplementary Tables S13 and S14). For BC, there were positive associations for hospitalizations (HR 1.19, 95% CI, 1.16–1.22), ICU admissions (HR 1.19, 95% CI, 1.10–1.28), deaths (HR 1.06, 95% CI, 1.00–1.13) and hospital LOS (IRR 1.04, 95% CI, 1.02–1.07).

There was no clear evidence of departure from linearity for the association between NO2 and PM2.5 and COVID-19-related hospitalizations, ICU admissions, and deaths (Supplementary Figs. S5, S6, and S7), particularly in the most common exposure range.

Discussion

We observed a positive association between long-term exposure to PM2.5 and NO2 and severe COVID-19 in this large population-based cohort of adults in Catalonia, Spain, a country with a high burden of COVID-19 in 2020. In sensitivity analyses, associations were stable in two-pollutant models when accounting for different adjustments and when using different outcome definitions and air pollutant exposure models. O3 was positively associated with severe outcomes when adjusted for NO2.

Our estimates for long-term PM2.5 and COVID-19-related hospitalization are consistent with other cohorts of COVID-19 cases9,10. The association for hospitalization ranged from an odds ratio of 1.06 (95% CI, 1.01–1.12, per 1.7 μg/m3 (IQR) increase) to HR of 1.24 (95% CI, 1.16–1.32, per 1.5 μg/m3 (SD) increase) in analyses conducted in 150,000 COVID-19 cases in Ontario, Canada10 and 75,000 cases in California, US9. In contrast with our findings, analyses in these two cohorts and other individual-level studies12,15,18 observed no evidence of an association between long-term NO2 and hospitalization. Regarding BC, there was no evidence of an association with severe COVID-19 in two studies that evaluated this pollutant13,15. The limited sample sizes and selected populations of those studies could explain the differences from our findings.

Overall, our estimates are slightly greater in magnitude than the previous literature for COVID-19 hospitalization (Supplementary Table S15), although a direct comparison is not straightforward because of differences in exposure assessment, confounder adjustment, and outcome definition. One possible explanation for the observed differences is that we analyzed a population-based cohort; thus, our estimates encompassed the risk of infection and the associated risk of severe COVID-19 following infection. In contrast, cohorts including only COVID-19-diagnosed individuals estimated the risk of severe COVID-19 following infection19. We observed greater estimates during the first wave, which may reflect higher levels of susceptibility to severe COVID-19 compared to the second wave or unmeasured contextual confounding factors such as spatiotemporal patterns in health system capacity, which were less influential in the second wave.

Estimates for the association of long-term air pollution exposure with COVID-19 death are more inconsistent in the literature compared to those for hospitalization10,11,12,13,16,18. A population-based cohort study from the general adult population in Rome (n = 1,594,308) observed an HR of 1.08 (95% CI, 1.03–1.13, per IQR 0.92 μg/m3 increase) for long-term PM2.5 and 1.09 (95% CI, 1.02–1.16, per IQR 9.22 μg/m3 increase) for long-term NO2 for COVID-19-related deaths16. These estimates are greater than in a population-based cohort of COVID-19 cases (n = 3,139,804) in California, US11, where the estimated long-term PM2.5 association with death was an RR of 1.04 (95% CI, 1.03–1.05), which is similar to the estimate in this study (HR of 1.04, 95% CI, 1.02–1.06, both for a 1 μg/m3 increase in PM2.5). Nevertheless, a cohort with 150,000 COVID-19 cases in Canada reported null associations for death but positive associations for hospital and ICU admission10; a cohort of a selected population from the UK (UK-Biobank cohort, n = 424,721) observed null results for death for PM2.5 (HR 1.00, 95% CI, 0.89–1.11, per IQR 1.27 μg/m3 increase) and NO2 (HR 1.03, 95% CI, 0.90–1.16, per IQR 9.93 μg/m3 increase)12.
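Because the studies above report estimates per different exposure increments, a common scale helps when comparing them. The following sketch rescales a hazard ratio from one increment to another, assuming a log-linear exposure-response relationship (an assumption of this example, not a statement about any cited study's model). The function name and arguments are illustrative.

```python
import math

def rescale_hazard_ratio(hr: float, from_increment: float, to_increment: float) -> float:
    """Rescale a hazard ratio to a different exposure increment.

    Assumes log(HR) is linear in exposure, so that
    HR_new = exp(log(HR_old) / from_increment * to_increment).
    The same transformation applies to confidence limits.
    """
    beta = math.log(hr) / from_increment  # log-hazard change per unit of exposure
    return math.exp(beta * to_increment)

# Example with the Rome estimate quoted above: HR 1.08 per 0.92 ug/m3 of PM2.5,
# re-expressed per 1 ug/m3.
hr_per_unit = rescale_hazard_ratio(1.08, from_increment=0.92, to_increment=1.0)
print(f"HR per 1 ug/m3: {hr_per_unit:.2f}")
```

Under this assumption, the Rome estimate corresponds to an HR of about 1.09 per 1 μg/m3 of PM2.5.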

The estimates for the association between O3 and severe COVID-19 are hard to interpret because of its high negative correlation with the other pollutants, particularly NO2 (r = −0.82, Supplementary Methods). This could be observed in two-pollutant models with null or positive estimates, contrasting with single-pollutant models.

We evaluated hospital LOS as a surrogate of COVID-19 severity and burden on the health system20,21. The LOS is the result of patient severity, delivered care, and hospital performance, reflecting the required number of staff, beds, and devices and associated costs20,21. We observed a positive association of long-term PM2.5 and NO2 with hospital LOS. We observed associations of greater magnitude for ICU admission and hospitalization than for death, a pattern also observed in the majority of studies that evaluated more than one severity outcome9,10. Other individual factors may explain these differences in the magnitude of effect, such as frailty, given that frail individuals were more likely to die out of the hospital or were not eligible for ICU care, especially during the first waves.

There are several biological mechanisms through which long-term air pollution could increase the risk of severe COVID-19. An initial hypothesis was that long-term air pollution increases the baseline risk of the population exposed to higher levels, resulting in a greater prevalence of chronic comorbidities associated with severe COVID-19, such as hypertension. In this case, chronic comorbidities associated with long-term exposure to air pollution would mediate the association between long-term exposure and severe COVID-19. Although we did not perform a formal causal mediation analysis22, adjustment for chronic comorbidities associated with air pollution in our sensitivity analysis (model 5) resulted in minimal change in the estimates, similar to findings in other cohort studies11,16, suggesting other direct pathways are more relevant. A limitation to interpreting these estimates is that our main model includes a health risk index, to which chronic comorbidities partially contribute23. Another hypothesis is that air pollution exposure could facilitate SARS-CoV-2 binding, based on evidence that exposure to particulate matter upregulates the expression of SARS-CoV-2 receptors in the lung (e.g., angiotensin-converting enzyme 2)6. If this hypothesis is further validated, the association between air pollution and severe COVID-19 is likely driven mainly by mechanisms other than an increase in the overall population risk due to chronic comorbidities. Exposure to air pollution may also be related to changes in immune defenses that are key to mitigating SARS-CoV-2, such as a decrease in type II interferon response to SARS-CoV-2 and antibody response15,24. All of these hypothesized mechanisms would result in a population susceptible to severe COVID-19; however, further studies are needed to understand the main biological pathways involved.

Strengths of our analysis include the combination of population representativeness spanning large urban and rural areas, with detailed individual-level data for exposures and confounding adjustment in a country heavily affected by the pandemic during 2020, yielding good statistical power and external validity of our analysis. This allowed us to properly evaluate contrasting results in the literature, such as for NO2 and BC. We evaluated two-pollutant models, a range of complementary outcomes including health system burden, several sensitivity analyses, and assessed the shape of the exposure-response function. Additionally, we used a state-of-the-art exposure assessment model developed for COVAIR-CAT for the study period, providing updated estimates of ambient air pollution in the region at fine spatiotemporal resolution.

We evaluated the first year of the pandemic, a period without COVID-19 vaccines and Variants of Concern; thus, our estimates may not be representative of the effect of air pollution on COVID-19 in the later phases of the pandemic. However, Chen et al. observed positive associations between ambient PM2.5 and NO2 and severe COVID-19 after extending the follow-up of an earlier analysis of COVID-19 patients9 to include the Delta Variant of Concern period25. By extending the follow-up, the authors could evaluate the role of vaccination status; initial results showed an association between ambient pollution and severe COVID-19 outcomes in both vaccinated and unvaccinated individuals25.

We lacked data on some individual-level potential confounders, such as race/ethnicity, migration status, physical activity, and occupation. The adjustment for individual-level income could partially adjust for some of these variables, but residual confounding may still be present.

We operationalized our outcome definition based on a time-defined window from clinically or laboratory-confirmed COVID-19 diagnosis. This allowed us to deal with the lack of access to testing during the first wave and avoid selection bias26, although some misclassification in COVID-19 diagnosis may have been present for cases not laboratory confirmed. This pragmatic time-defined definition, used in different COVID-19 studies and policy decisions9,16, captured acute complications of COVID-19 occurring within 30 days of infection but could also include some hospitalizations unrelated to COVID-19. However, results from our sensitivity analyses addressing these limitations, such as analyzing only laboratory-confirmed cases and cause-specific hospitalizations, yielded similar estimates. When evaluating the cohorts of individuals with a COVID-19 diagnosis in sensitivity analyses, we observed smaller estimates compared with the main analysis; however, estimates based only on individuals who were tested are likely affected by selection bias27,28.

Long-term exposure to ambient air pollution was positively associated with severe COVID-19 events, including COVID-19-related hospitalization, ICU admission, and deaths, as well as the length of hospital stay in a large, population-based cohort. Our findings add further compelling evidence on the importance of reducing air pollution levels to improve population health generally and severe acute respiratory infection specifically.

Methods

Study design and population

We constructed a population-based cohort of the adult population of Catalonia (the northeast region of Spain) as part of the COVAIR-CAT study. The COVAIR-CAT cohort was built through record linkage using data collected in the public health administration databases of Catalonia22. The public healthcare system covers nearly the entire population (98.8% of the 7.4 million in 2015)29. Catalonia (32,113 km2) is composed of 947 municipalities grouped in seven health regions (median area of 5425 km2). Health regions administer the public health system, accounting for geographical, socioeconomic, demographic, and health facility availability differences, with the aim of guaranteeing equitable healthcare access. Healthcare management areas (AGA, n = 43, median area 389 km2) are territorial boundaries based on the aggregation of nested primary care service areas (ABS, n = 374, median area 14 km2). Maps of the health areas are shown in Supplementary Methods. These geographic units are used for the operational planning, coordination, and analysis of the main flows between primary care and basic hospital care.

The original cohort included 5,127,059 adult (≥18 years) residents of Catalonia who were covered by the public healthcare system in 201522. COVAIR-CAT includes all individuals from the cohort who were alive and residing in Catalonia on March 1, 2020 (n = 4,669,011), and therefore excludes the population that arrived in Catalonia between 2016 and 2020. We followed participants through December 31, 2020. A detailed description of the cohort construction is in Supplementary Methods.

Data were managed to ensure anonymization in accordance with current data protection legislation by the Agency for Health Quality and Assessment of Catalonia (AQuAs). The cohort design, definitions, and analysis plan were pre-specified in a protocol before any data extraction. Any deviation from the initial plan is labeled as post-hoc. We received approval from our local ethics committee, the Parc de Salut Mar Ethics Committee (CEIM-PS MAR, no. 2020/9610).

Data sources

Participants were identified from the Catalan Central Registry of Insured Persons, which collects sociodemographic, migration, and vital status information using a unique identifier. We used this unique identifier for a deterministic linkage across different administrative databases: primary care (CMBD-AP), urgency care (CMBD-URG), and acute hospital discharge (CMBD-AH), which provided detailed information on comorbidity and hospital and ICU admissions based on International Classification of Diseases (ICD) codes (ICD-9 before 2017 and ICD-10 after 2017)22,30. Additionally, we used data from a surveillance system of SARS-CoV-2 tests performed in Catalonia (SUVEC) to gather information on RT-qPCR and rapid antigen tests among cohort participants. We used other public sources for area-level covariates, such as the 2011 Spanish Census, satellite data, and a COVID-19 pandemic indicator (i.e., weekly test-positive proportion31).

Outcomes

Our primary outcome was COVID-19-related hospitalization. Secondary outcomes were COVID-19-related death, ICU admission, and hospital length of stay (LOS). We defined a COVID-19-related event as an event that occurred within 30 days of a COVID-19 diagnosis9,16. We defined an individual with a COVID-19 diagnosis as one with a positive RT-qPCR or rapid antigen test (laboratory-confirmed COVID-19 diagnosis) or with a clinical diagnosis of COVID-19. Clinical diagnosis of COVID-19 was defined by the respective ICD-10 codes, as notified in the administrative healthcare databases. The first COVID-19 diagnosis could be in primary care, urgency care units, or hospitals. We considered COVID-19 diagnoses in the general population, excluding diagnoses at nursing homes in the main analysis, because nursing home residents differ from the general population in their high frailty, their markedly different pattern of COVID-19 spread and eligibility for hospital admission, and their clustered air pollution exposure10. For this analysis, we considered only the first COVID-19 diagnosis from March 1, 2020, to December 31, 2020. After identifying the date of the first COVID-19 diagnosis, we defined a COVID-19-related hospitalization as a hospital admission from any cause occurring in the following 30 days and a COVID-19-related death as death from any cause occurring in the following 30 days16. To account for individuals who were first hospitalized and had a subsequent COVID-19 diagnosis, particularly during the first wave of the pandemic in Spain, we also considered hospitalizations that occurred in the 10 days before the first COVID-19 diagnosis. For each COVID-19-related hospitalization, we retrieved data on whether the participant was admitted to the ICU and the hospital LOS during that hospitalization.
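The time windows in the outcome definition can be illustrated with a short sketch. The function and constant names are hypothetical, and treating both window boundaries as inclusive is an assumption of this example, since the text does not state how boundary days were handled.

```python
from datetime import date, timedelta
from typing import Optional

# Window lengths taken from the outcome definition in the text.
HOSPITAL_LOOKBACK = timedelta(days=10)  # admissions shortly before diagnosis
EVENT_WINDOW = timedelta(days=30)       # events after the first diagnosis

def is_covid_related_hospitalization(diagnosis: date, admission: Optional[date]) -> bool:
    """True if an all-cause admission falls within 10 days before or
    30 days after the first COVID-19 diagnosis (boundaries assumed inclusive)."""
    if admission is None:
        return False
    return diagnosis - HOSPITAL_LOOKBACK <= admission <= diagnosis + EVENT_WINDOW

def is_covid_related_death(diagnosis: date, death: Optional[date]) -> bool:
    """True if an all-cause death falls within 30 days after the first
    COVID-19 diagnosis (boundaries assumed inclusive)."""
    if death is None:
        return False
    return diagnosis <= death <= diagnosis + EVENT_WINDOW

# Example: diagnosis on April 10, 2020, with an admission 8 days earlier.
dx = date(2020, 4, 10)
print(is_covid_related_hospitalization(dx, date(2020, 4, 2)))   # inside the lookback window
print(is_covid_related_hospitalization(dx, date(2020, 3, 30)))  # 11 days before diagnosis
```

Only the hospitalization definition includes the 10-day lookback; the death definition in the text refers only to the 30 days following diagnosis.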

Exposures

We assessed individual-level exposure to ambient levels of PM2.5, NO2, and O3 from the COVAIR-CAT exposure assessment models. We developed an exposure assessment for daily temperature, PM2.5, PM10, NO2, and maximum 8h-average O3 at a spatial resolution of 250 m for the period 2018–2020 in Catalonia. We used meteorological and air pollution data from the Catalan and Spanish monitoring networks and applied machine learning methods tailored for spatiotemporal prediction (Random Forest-based spatial variable selection)32. From the daily estimates, we obtained the annual average of PM2.5 and NO2 and the warm season average for O3, corresponding to 2019. The station-based nested 10-fold cross-validation R2 was 0.61 for PM2.5, 0.77 for NO2, and 0.87 for O3. In a complementary analysis, we used the annual average estimates of the PM2.5, NO2, O3, and BC derived from land-use regression models developed through the ELAPSE (Effects of Low-Level Air Pollution: A Study in Europe) project for 201033. We assigned the 2019 air pollutant exposures to each participant’s residential address at the start of 2021 or the last available because we did not have the residential address at the start of 2020 as the address registry for 2020 in Catalonia was disrupted by the pandemic.

Detailed information about the COVAIR-CAT and ELAPSE models is provided in Supplementary Methods.

Covariates

We obtained age, sex, individual-level income, and health risk group in 2015 from the Central Registry of Insured Persons. Individual income group was based on the co-payment system for drug dispensations, which largely depends on income22. Individual health risk group is a validated ordinal index that encompasses multimorbidity and levels of patient complexity, accounting for acute, chronic or oncological morbidities, single or multimorbidity, medications, and demand of the health system30,34.

Tobacco smoking status (non-smoker, former smoker, or active smoker), previous chronic comorbidities, and body mass index were obtained from the primary care database. Selected chronic comorbidities were also obtained from the hospital admissions database (e.g., chronic obstructive pulmonary disease)22. Nursing home status for those with COVID-19 diagnosis was obtained from the COVID-19 surveillance system.

Area-level indicators were linked to individuals’ residence addresses. The urbanicity index divided municipalities into towns, urban, and rural areas. The small area socioeconomic index was ascertained at the ABS level22,35, while the deprivation and Gini indexes and the proportion of non-Spanish residents were ascertained at the census tract level22. As a surrogate for public health system accessibility, we derived the distance from the residence to the closest primary care center. Finally, we obtained the weekly test-positive proportion of RT-qPCR and rapid antigen tests at the AGA level.

A detailed description of all covariates is shown in Supplementary Methods.

Data analysis

We described continuous variables using mean ± standard deviation (SD) or median [p25–p75] and categorical variables as proportions. There were missing values for the tobacco smoking and body mass index covariates. For the main analysis, we treated a missing value for tobacco smoking as non-smoker because the value is most often omitted for non-smokers in the primary care service, while body mass index was used only in sensitivity analysis after multiple imputation.

We fit Cox proportional hazards models to estimate the association between the 2019 annual average air pollution and COVID-19-related hospitalization, ICU admission, and death, with separate models for each pollutant and outcome. The analyses of COVID-19-related hospitalization, ICU admission, and death were conducted in the whole population19, while the analysis of hospital length of stay was conducted on those individuals with COVID-19-related hospitalization. Our main analyses are based on the COVAIR-CAT estimates for 2019, and we evaluated single- and two-pollutant models. We accounted for the competing event of death when evaluating COVID-19-related hospitalization and ICU admission by censoring follow-up at death, using the cause-specific HR framework36,37. Follow-up started on March 1, 2020, and for the primary outcome (COVID-19 hospitalization), right-censoring occurred at the first instance of death, 30 days after the first COVID-19 diagnosis, emigration outside the study area, or the end of the study period. We used the time from March 1, 2020, in days as the time scale in the time-to-event analysis. We assessed the proportional hazards assumption of our models by visual inspection of score residuals plotted against event time. We fitted negative binomial regression models to estimate the association between the 2019 annual average air pollution and hospital LOS among those individuals who were hospitalized38. Measures of association for air pollutants were reported as hazard ratios (HR) or incidence rate ratios (IRR) per interquartile range (IQR) increase, with their 95% confidence intervals (CI).
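The follow-up rules for the primary outcome can be sketched as below. This is a simplified illustration, not the study code: the function name and arguments are hypothetical, the admission date is assumed to have already been classified as COVID-19-related, and ties between an admission and a censoring date are resolved in favor of the event by assumption.

```python
from datetime import date, timedelta
from typing import Optional, Tuple

STUDY_START = date(2020, 3, 1)   # start of follow-up
STUDY_END = date(2020, 12, 31)   # end of the study period

def hospitalization_follow_up(
    diagnosis: Optional[date],
    admission: Optional[date],   # COVID-19-related admission, if any
    death: Optional[date],
    emigration: Optional[date],
) -> Tuple[int, bool]:
    """Return (days since March 1, 2020, event indicator) for the primary outcome.

    Right-censoring occurs at the earliest of death, 30 days after the first
    COVID-19 diagnosis, emigration, or the end of the study period. Censoring
    at death corresponds to the cause-specific hazard approach.
    """
    censor_dates = [STUDY_END]
    if death is not None:
        censor_dates.append(death)
    if emigration is not None:
        censor_dates.append(emigration)
    if diagnosis is not None:
        censor_dates.append(diagnosis + timedelta(days=30))
    censor = min(censor_dates)

    if admission is not None and admission <= censor:
        return (admission - STUDY_START).days, True
    return (censor - STUDY_START).days, False

# A participant never diagnosed is followed to the end of the study period.
print(hospitalization_follow_up(None, None, None, None))
# A participant diagnosed on April 10 and admitted on April 15.
print(hospitalization_follow_up(date(2020, 4, 10), date(2020, 4, 15), None, None))
```

The returned pair is the usual (time, status) input to a time-to-event model, with time measured in days from March 1, 2020, as in the text.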

We performed the following sequential adjustment for all exposures and outcomes, as pre-defined based on a priori theoretical assumptions about the relationship between the covariates and the outcome:

  a. Model 1, adjusted for age (fitted as a penalized spline with six degrees of freedom, with the number of degrees of freedom selected by AIC value) and sex (strata, 2 levels);

  b. Model 2, Model 1 plus tobacco smoking status (factor, 3 categories), individual income (factor, 3 categories), and health risk group (factor, 4 categories);

  c. Model 3, Model 2 plus area-level covariates: small area socioeconomic index (continuous term), the proportion of non-Spanish nationals (continuous term), distance to the closest primary care unit (continuous term), urbanicity (strata, 3 categories), and weekly average test-positive proportion (continuous term); and

  d. Model 4 (main model), Model 3 plus health region (strata, 7 categories).
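The nesting of Models 1 to 4 can be made explicit with a small sketch that accumulates the covariate sets step by step. The covariate labels follow the list above; the data structure and function name are illustrative assumptions and do not reflect how the models were actually coded.

```python
# Covariates added at each step of the sequential adjustment (labels from the text).
MODEL_STEPS = [
    ("Model 1", ["age (penalized spline)", "sex (strata)"]),
    ("Model 2", ["tobacco smoking status", "individual income", "health risk group"]),
    ("Model 3", [
        "small area socioeconomic index",
        "proportion of non-Spanish nationals",
        "distance to closest primary care unit",
        "urbanicity (strata)",
        "weekly average test-positive proportion",
    ]),
    ("Model 4", ["health region (strata)"]),
]

def cumulative_covariates(steps):
    """Map each model name to the full list of covariates it adjusts for."""
    models = {}
    accumulated = []
    for name, added in steps:
        accumulated = accumulated + added  # each model extends the previous one
        models[name] = list(accumulated)
    return models

models = cumulative_covariates(MODEL_STEPS)
for name, covariates in models.items():
    print(f"{name}: {len(covariates)} adjustment terms")
```

Each model contains every term of the previous one, so differences between successive estimates reflect only the newly added covariates.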

We performed six sensitivity analyses defined a priori:

  a. Model 5 added potential mediators (diabetes, chronic obstructive pulmonary disease, obesity, dyslipidemia, hypertension, and other cardiovascular disorders) to Model 4;

  b. Model 6 added other socioeconomic indexes (inequity index, Gini, and deprivation index) to Model 4;

  c. Model 7 used multiple imputation by chained equations to impute tobacco smoking status and body mass index, running Model 5 with obesity replaced by body mass index in 10 imputed datasets;

  d. Model 8 was Model 4 with the outcomes restricted to laboratory-confirmed COVID-19;

  e. Model 9 was Model 4 but included COVID-19 diagnoses at nursing homes; and

  f. Model 10 was Model 4 in the subpopulation that did not move ABS between 2015 and 2020.

Additional ad hoc sensitivity analyses based on our main model (Model 4) included: (1) censoring the cohort on December 1, 2020, so that all diagnosed participants had up to 30 days of follow-up for the event to occur; (2) adding distance to the nearest hospital; (3) adding population density at the census tract level; (4) adjusting for smoking status using a missing indicator instead of treating missing values as non-smokers; (5) running the analysis on the cohort with COVID-19 diagnosis; (6) running the analysis on the cohort with COVID-19 diagnosis at primary care; (7) considering hospitalizations with COVID-19 as the main cause of admission instead of all-cause admissions.

We evaluated potential nonlinearity for age and body mass index by testing three to six degrees of freedom in a penalized spline. We compared the AIC values of the models to determine whether nonlinearity was present.

We conducted several complementary analyses. To explore the exposure modeling assessment, we replicated all previous models (Model 1 to Model 10) using PM2.5, NO2, O3, and BC from the ELAPSE model33 and Model 4 (main analysis) using the COVAIR-CAT estimates for 2018. We explored potential effect modification by the first and second COVID-19 waves in Catalonia. We fit a time-stratified Cox proportional hazards model defining strata by Wave 1 (March 1 to June 20, 2020) and Wave 2 (June 21, 2020, to December 31, 2020) in Model 4 of the main analysis. The waves were defined by splitting the study period at the week with the lowest number of COVID-19 cases in Catalonia (Fig. 1). To explore the definition of COVID-19-related hospitalization, we fit Model 4 for COVID-19-related hospitalization considering only admissions with COVID-19 or respiratory causes as the main cause of admission instead of all-cause admissions.
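The wave definition can be expressed as a simple date rule. The sketch below only assigns a calendar date to a wave using the boundaries given in the text; the function name is hypothetical, and how follow-up time was split across waves in the time-stratified model is not shown here.

```python
from datetime import date

STUDY_START = date(2020, 3, 1)
STUDY_END = date(2020, 12, 31)
WAVE_2_START = date(2020, 6, 21)  # first day of the second wave

def covid_wave(day: date) -> int:
    """Return 1 for Wave 1 (March 1 to June 20, 2020) and 2 for Wave 2
    (June 21 to December 31, 2020); raise an error outside the study period."""
    if not STUDY_START <= day <= STUDY_END:
        raise ValueError(f"{day} is outside the study period")
    return 1 if day < WAVE_2_START else 2

print(covid_wave(date(2020, 6, 20)))  # last day of Wave 1
print(covid_wave(date(2020, 6, 21)))  # first day of Wave 2
```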

Finally, to explore potential non-linear exposure-response functions between air pollutants and outcomes, we refit Model 4 of the main analysis, allowing for nonlinearity using penalized splines with three degrees of freedom. A detailed description of all models, complementary analyses, and multiple imputation is provided in Supplementary Methods.

All analyses were conducted in R software, version 4.1.2 (R Core Team, 2020).

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.