Experimental studies and epidemiologic observations suggest that exposure to higher levels of air pollution, specifically ambient particulate matter less than 2.5 µm in diameter (PM2.5), is associated with increased risk of incident chronic kidney disease (CKD), CKD progression, and end stage renal disease (ESRD)1,2,3. A large body of evidence, both mechanistic and epidemiologic, also suggests that exposure to higher levels of environmental air pollution, and in particular PM2.5, is associated with increased risk of diabetes, a causal driver of CKD4. However, whether the described association between PM2.5 and risk of kidney disease is mediated in part or fully by diabetes is not known. Addressing this knowledge gap will help a) enhance our understanding of how exposure to fine particulate matter air pollution affects kidney function, and b) inform more accurate estimates of the burden of kidney disease and burden of diabetes attributable to PM2.5 pollution5,6. In this work we built a cohort of United States veterans to estimate the proportion of the association between PM2.5 and adverse kidney outcomes that is mediated by diabetes.


A cohort of 2,444,157 United States veterans was followed over a median of 8.5 years (IQR: 8.0–8.8). The geographic distribution of cohort participants is mapped in Supplementary Figure S1. Demographic and health characteristics of the overall cohort and by PM2.5 quartile are provided in Table 1. Compared to those in the lowest quartile of PM2.5, those in the highest quartile were more likely to be black and to have a diabetes diagnosis or be taking a medication for diabetes, and had a higher T0 estimated glomerular filtration rate (eGFR). Adjusted incident rates of kidney disease outcomes increased across increasing PM2.5 quartiles (Fig. 1, Supplementary Table S1).

Table 1 Demographic and health characteristics of the overall study cohort and according to quartiles of annual average PM2.5 concentrations.
Figure 1

Adjusted incident rates of adverse kidney disease outcomes by PM2.5 quartile. Rates adjusted for age, race, sex, and T0 eGFR.

To establish the potential role of diabetes as a mediator in the association of PM2.5 with kidney disease outcomes, we first tested the association of PM2.5 and diabetes; our results suggest that a 10 μg/m3 increase in PM2.5 was associated with increased odds of diabetes (Table 2). PM2.5 was also associated with increased risk of kidney disease outcomes (Table 2). We additionally verified the presence and magnitude of the a priori established association of diabetes and risk of kidney disease (Table 2). Thus, diabetes may be a mediator in the relationship of PM2.5 and kidney disease outcomes (Table 2).

Table 2 Associations of ambient fine particulate matter (PM2.5) with diabetes and kidney disease outcomes, and of diabetes and kidney disease outcomes.

To address concerns about spurious associations and biases around the relationship of PM2.5 with diabetes and kidney disease outcomes, we used ambient air sodium levels, an exposure contextually similar to PM2.5, as a negative control7; the associations between sodium and diabetes, and between sodium and kidney disease outcomes, were either non-significant or vanishingly weak (Supplementary Table S2).

We then conducted mediation analyses to estimate the proportion of the association between PM2.5 and kidney disease outcomes that is mediated by diabetes. The proportion of the association mediated by diabetes was 4.7% (95%CI: 4.3–5.7%), 4.8% (4.2–5.8%), 5.8% (5.0–7.0%), and 17.0% (13.1–20.4%) for incident eGFR <60 ml/min/1.73m2, incident CKD, ≥30% decline in eGFR, and ESRD or ≥50% decline in eGFR, respectively (Table 2 and Fig. 2). Results were consistent when exposure was alternatively assessed using National Aeronautics and Space Administration (NASA) data, where the proportion of the association mediated by diabetes was 5.8% (95%CI: 5.4–6.3%), 6.2% (5.7–6.9%), 7.7% (7.1–8.4%), and 14.6% (13.1–16.2%) for incident eGFR <60 ml/min/1.73m2, incident CKD, ≥30% decline in eGFR, and ESRD or ≥50% decline in eGFR, respectively (Table 3). Mediation results were consistent in sensitivity analyses including: (a) when the exposure was assigned by the nearest air monitoring station within 30 miles (Table 3); (b) when models were additionally adjusted for contextual county level characteristics (Table 3); (c) when accelerated failure time models were used (Table 3); and (d) when the non-composite outcome of ≥50% decline in eGFR was assessed, where we observed a proportion mediated of 16.9% (12.9–19.3%).

Figure 2

Proportion of the association between PM2.5 and kidney disease outcomes mediated by diabetes.

Table 3 Sensitivity analyses of the proportion of the association between PM2.5 and kidney disease mediated by diabetes.


While PM2.5 is associated with diabetes, a causal driver of CKD, our findings suggest that diabetes mediates only a small proportion of the association of PM2.5 with risk of kidney outcomes. The corollary observation is that a substantial proportion of the association between PM2.5 and kidney outcomes may reflect a direct relationship. The findings will likely inform more accurate estimation of the burden of kidney disease attributable to air pollution, an issue of rising importance on the global health agenda.

There is increasing realization that pollution is a major driver of non-communicable diseases around the globe8; the majority (71%) of the 9 million annual deaths attributed to pollution are caused by non-communicable diseases. The Lancet Commission on Pollution and Health and subsequent reports specifically outlined the need to define and estimate the burdens of diabetes and kidney disease attributable to air pollution9,10,11. The United Nations high level meeting on non-communicable diseases, held in September 2018, outlined a shift in framework from the four-by-four to the five-by-five approach and added environmental risk factors as key drivers of non-communicable diseases11. The World Health Organization now formally recognizes environmental air pollution as a risk factor for non-communicable diseases11. Accurate estimation of the burden of non-communicable diseases (including diabetes and kidney disease) is critically important to help inform this effort. The Global Burden of Disease study estimated that 82 million disability-adjusted life-years, a measure of the number of years of healthy life lost, were attributable to PM2.5 in 2016, with 10.1 million due to diabetes12. Disability-adjusted life-years due to CKD attributable to PM2.5 globally in 2016 have been estimated at 11.5 million13. Our results provide a quantitative estimate of the portion of the relationship between PM2.5 and kidney disease that is mediated by diabetes, and suggest that the burdens of diabetes and CKD attributable to PM2.5 may only marginally overlap.

The biologic mechanism of a direct relationship between PM2.5 and kidney disease is not entirely clear. Prior experimental and human evidence shows that inhaled nano-particles, when sufficiently small, enter the bloodstream and are excreted in the urine in animals and humans14, suggesting a putative size-dependent pathway through which inhaled particles may come into contact with kidney tissue and exert their pathologic effect. Experimental evidence in mice and rats suggests that inhalation of PM2.5 leads to significant structural kidney glomerular and tubular changes including tubular atrophy, mesangial expansion, advanced glomerulosclerosis, and decreased glomerular and tubular lumen volumes15,16. Experimental findings from Nemmar and collaborators suggest that exposure to diesel exhaust particles (which experimentally simulate environmental exposure to PM2.5) leads to disturbances of kidney hemodynamics (and alteration of kidney blood flow) and kidney vascular damage, promotes oxidative stress, inflammation, and DNA damage in kidney tissue, exacerbates acute kidney injury, and further promulgates chronic kidney injury in murine models17,18,19. Further research is needed to define the mechanistic pathway(s) through which PM2.5 adversely affects kidney function and contributes to the biology of CKD.

The magnitude of the proportion mediated by diabetes was higher for the outcome of ESRD or ≥50% decline in eGFR than for the other kidney outcomes (Table 2). This is most likely a reflection of the strength of the association between diabetes and ESRD relative to the other kidney disease outcomes; CKD due to diabetes progresses more rapidly, and as such it is more likely to lead to a severe and terminal outcome (such as ESRD).

The study has several limitations. The cohort consisted of United States veterans; results may not be generalizable to other populations. Diabetes and PM2.5 were assessed concurrently, and therefore temporality could not be established. Our analyses did not consider the source or chemical composition and toxic content of PM2.5, which may exhibit regional variation; however, studies have shown that estimates using non-specific PM2.5 biomass alone will underestimate the burden of disease attributable to PM2.5 pollution20. In our analyses we used population level exposure estimates rather than individual measurements, which may have resulted in exposure misclassification. Our datasets did not include data on other air pollutants (ambient coarse particulate matter of ≤10 μm in aerodynamic diameter, nitrogen dioxide, carbon monoxide, and others) or on other parameters including temperature and humidity, and analyses did not account for potential geographic heterogeneity in effect, which may have biased results21. Although we took careful measures to account for potential confounders, tested a negative exposure control to address potential spurious associations, and conducted a sensitivity analysis adjusting for contextual county level characteristics as a means of investigating whether shared contextual confounders influenced results, we cannot completely eliminate the possibility that the proportion mediated is influenced by residual confounding. Strengths included the use of a large national cohort of US veterans who receive care in a single integrated healthcare network, the use of a negative control to investigate the presence of possible hidden bias, and consistency of results across analyses using different data sources to define exposure.

In summary, our results suggest that a small proportion of risk of kidney disease outcomes associated with PM2.5 exposure is mediated by diabetes. A greater understanding of mechanisms underlying the direct relationship between PM2.5 pollution and risk of kidney disease is needed.


Cohort participants

United States Department of Veterans Affairs (VA) Healthcare System users with a minimum of one outpatient eGFR measurement between October 1, 2003 and September 30, 2004 and no previous history of ESRD were selected from VA data, where the date of last eGFR measurement during this time period was designated time zero (T0) (n = 2,751,717). Participants were additionally selected based on the criteria of having at least one outpatient eGFR measurement after T0 (n = 2,680,431), and were followed until September 30, 2012 or death. The final analytic cohort was subsequently chosen by restriction to those who had Environmental Protection Agency (EPA) derived PM2.5 (n = 2,628,465) data and data on all covariates, yielding a final analytic cohort of 2,444,157 (Fig. 3). This study was approved by the Institutional Review Board of the VA Saint Louis Health Care System, Saint Louis, MO. The study was carried out in accordance with relevant guidelines and regulations. A waiver of informed consent was granted by the Institutional Review Board of the VA Saint Louis Health Care System.

Figure 3

Cohort flow chart.

Data sources

Participants’ demographics, inpatient and outpatient data, laboratory information, vital signs, and medications were obtained from Department of Veterans Affairs datasets, which are collected through routine care received at the VA Health Care System22,23,24,25,26,27,28. Further details on VA datasets are provided in the Supplementary Methods. Data from the United States Renal Data System (USRDS), obtained through the VA/Centers for Medicare and Medicaid Services (CMS), were used in assessing ESRD status29. The Centers for Disease Control and Prevention’s (CDC) National Environmental Public Health Tracking Network provided annual particulate matter estimates for the contiguous United States that originate from Community Multiscale Air Quality (CMAQ) modeled output30,31, which is based on United States Environmental Protection Agency’s (EPA) Air Quality System (AQS) data. EPA data also provided information on ambient air sodium levels, as well as the latitude and longitude of the EPA air monitoring stations whose measures were obtained32. NASA Socioeconomic Data and Applications Center Global Annual PM2.5 Grids from Moderate Resolution Imaging Spectroradiometer, Multi-angle Imaging SpectroRadiometer and Sea-Viewing Wide Field-of-View Sensor Aerosol Optical Depth with Geographically Weighted Regression (version 1) remote space-borne satellite sensing data served as an additional source of ambient PM2.5 estimates at the 10 × 10 km resolution33,34. The Census Bureau’s Model-based Small Area Income & Poverty Estimates (SAIPE) supplied annual estimates of county level percent in poverty35. Information on county level population density was obtained from the 2000 Census of Population and Housing35. Latitude and longitude for ZIP code tabulation areas, which are used in place of ZIP codes in order to define a concrete geographic area, were obtained from the 2000 Census Gazetteer File36.
Contextual county-level characteristics were curated from the County Health Rankings datasets and encompassed data in several domains including demographics, physical environment, social and economic factors, and health behaviors37. Further information on the county level characteristics may be found in the Supplementary Methods.


Outcomes assessed were time until incident eGFR <60 ml/min/1.73 m2, incident CKD (defined as two eGFR <60 ml/min/1.73 m2 at least 90 days apart23), greater than or equal to 30% decline in eGFR from eGFR at T0, and the composite outcome of ESRD or greater than or equal to 50% decline in eGFR38. Incident eGFR <60 ml/min/1.73 m2 and incident CKD were assessed in sub-cohorts who had no prior history of these outcomes, n = 1,679,965 and 1,616,153 respectively. Patients were censored after onset of ESRD (for non-ESRD outcomes) and at time of death or end of study follow-up. ESRD was ascertained through linkage of USRDS and VA databases. Outpatient eGFR was used in the evaluation of all outcomes except for ESRD. ESRD and greater than or equal to 50% decline in eGFR were combined into a composite outcome due to the low event rate of ESRD. eGFR was calculated using the four-variable abbreviated Chronic Kidney Disease Epidemiology Collaboration (CKD-EPI) equation on the basis of age, race, gender, and serum creatinine39.
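The outcome definitions above can be illustrated with a minimal Python sketch. It rests on several assumptions that are not stated in the text: that the eGFR equation is the 2009 CKD-EPI creatinine equation (serum creatinine in mg/dL), that incident CKD is dated at the second qualifying measurement, and that a percent decline is met at the first measurement at or below the cutoff. The function names are hypothetical, and the study itself was conducted in SAS.

```python
from datetime import date


def ckd_epi_2009(scr_mg_dl, age, female, black):
    """eGFR (ml/min/1.73 m2) from the 2009 CKD-EPI creatinine equation (assumed)."""
    kappa = 0.7 if female else 0.9
    alpha = -0.329 if female else -0.411
    ratio = scr_mg_dl / kappa
    egfr = 141.0 * min(ratio, 1.0) ** alpha * max(ratio, 1.0) ** -1.209 * 0.993 ** age
    if female:
        egfr *= 1.018
    if black:
        egfr *= 1.159
    return egfr


def incident_ckd_date(measurements, threshold=60.0, min_gap_days=90):
    """Date on which two eGFR values below threshold, at least min_gap_days
    apart, are first observed; None if the criterion is never met.

    measurements: iterable of (date, egfr) pairs taken after T0.
    Assumption: the event is dated at the later of the two measurements.
    """
    low_dates = sorted(d for d, egfr in measurements if egfr < threshold)
    for later in low_dates:
        if (later - low_dates[0]).days >= min_gap_days:
            return later
    return None


def first_decline_date(measurements, baseline_egfr, fraction):
    """Date of the first eGFR at or below (1 - fraction) * baseline, else None."""
    cutoff = baseline_egfr * (1.0 - fraction)
    for d, egfr in sorted(measurements):
        if egfr <= cutoff:
            return d
    return None


# Hypothetical follow-up measurements for one participant with T0 eGFR of 80.
followup = [
    (date(2005, 1, 10), 72.0),
    (date(2005, 6, 1), 58.0),
    (date(2005, 7, 15), 57.0),
    (date(2005, 10, 1), 50.0),
]
ckd_date = incident_ckd_date(followup)
decline30_date = first_decline_date(followup, baseline_egfr=80.0, fraction=0.30)
```

In this made-up example the two eGFR values below 60 recorded in June and July are only 44 days apart, so the CKD criterion is not met until the October measurement.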


Baseline county of residence40 was linked with ambient fine particulate matter air pollutant exposure data from the EPA CMAQ modeled output31. We also linked ZIP code of residence with exposure data from NASA as an alternative exposure source34. Further details on NASA exposure assignment may be found in the Supplementary Methods. Diabetic status was jointly defined by the presence of International Classification of Diseases (ICD-9) codes and diabetic medication, including oral hypoglycemic agents and insulin, in the year prior to and including T0, as a three-level variable: no diabetes, diabetes diagnosis but no medication use, and diabetic medication use.


Baseline covariates were ascertained from October 1, 1999 until cohort entry (T0). Covariate selection was informed by prior knowledge1,2,4,41; covariates were chosen based on potential confounding of the association of PM2.5 with diabetes and adverse kidney outcomes, and of the diabetes and adverse kidney outcomes association. Covariates included age, race, gender, cancer, cardiovascular disease, chronic lung disease, hyperlipidemia, hypertension, T0 eGFR, systolic blood pressure, diastolic blood pressure, body mass index, smoking status, angiotensin-converting enzyme inhibitor/angiotensin receptor blocker (ACEI/ARB) use, number of outpatient eGFR measurements, number of hospitalizations, county population density, and county percent in poverty. Covariates were treated as continuous variables where appropriate unless otherwise indicated. Race/ethnicity was categorized as white, black, or other (Latino, Asian, Native American, or other racial/ethnic minority groups). Comorbidities were assigned on the basis of relevant ICD-9-CM diagnostic and procedure codes and Current Procedural Terminology (CPT) codes in the VA Medical SAS datasets using definitions validated for use in VA datasets1,2,25,26,27,42,43,44,45,46,47. The values for systolic and diastolic blood pressure consisted of the average of all measures in the year prior to T0. Body mass index was modeled as a restricted cubic spline. Smoking status was defined as current, former, or never smoker. ACEI/ARB use was defined as use if there were prescriptions for 90 days or greater during the time before T0. Number of outpatient eGFR measurements represented the cumulative number of outpatient eGFR values from October 1, 1999 to T0. Number of hospitalizations was derived from the cumulative number of inpatient stays lasting a full day or longer from October 1, 1999 to T0. Population density and percent in poverty were assigned based on county of residence at T0.


Descriptive statistics are presented as frequency (percent) and mean (standard deviation) or median (interquartile range). Adjusted incident rates, per 100,000 persons, of the adverse kidney disease outcomes were calculated by PM2.5 quartile using a Poisson regression applied to individual level data; rates were adjusted for age, race, sex, and T0 eGFR. Adjusted multinomial logistic generalized estimating equations were used in estimating the association between PM2.5 and diabetes, and Cox proportional hazard models with a robust sandwich variance estimator were used in estimating associations between PM2.5 and time until CKD outcomes, and between diabetes and time until CKD outcomes. Mediation analyses were conducted using the inverse odds ratio-weighting (IORW) method48,49. Briefly, the IORW method first regresses the exposure, as the outcome in a model, on all mediators and covariates. Results from this model are used to generate weights; results from weighted and unweighted models provide measures of the direct and total effect, respectively, from which the indirect effect, and hence the proportion mediated, may be calculated. Further details on the inverse odds ratio-weighting method may be found in the Supplementary Methods. Linearity of terms was assessed through restricted cubic spline plots and a Wald chi-squared test for non-linearity. Variables that deviated from linearity were transformed or treated as a restricted cubic spline, determined by minimization of the model Akaike information criterion (AIC). PM2.5 and risk of kidney disease outcomes did not show evidence of strong deviation from linearity, so no spline was used. We repeated the mediation analyses using PM2.5 exposure values derived from NASA satellites as an alternative source for exposure data.
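The final step of the mediation analysis, going from the total and direct effects to a proportion mediated, can be sketched as follows. This is only an illustration under stated assumptions: that effects are compared on the log hazard ratio scale, with the indirect effect taken as the total minus the direct effect. The function name and the hazard ratios in the example are hypothetical, not study results, and the weight estimation and model fitting steps are not shown.

```python
import math


def proportion_mediated(hr_total, hr_direct):
    """Proportion of the total effect that is mediated, on the log hazard ratio scale.

    hr_total:  hazard ratio from the unweighted model (total effect)
    hr_direct: hazard ratio from the inverse-odds-ratio-weighted model (direct effect)
    Assumption: indirect effect = log(hr_total) - log(hr_direct).
    """
    log_total = math.log(hr_total)
    log_direct = math.log(hr_direct)
    if log_total == 0.0:
        raise ValueError("total effect is null; proportion mediated is undefined")
    return (log_total - log_direct) / log_total


# Hypothetical hazard ratios chosen only to illustrate the arithmetic.
example_pm = proportion_mediated(hr_total=1.25, hr_direct=1.20)
```

Working on the log scale keeps the decomposition additive: the log of the total hazard ratio equals the sum of the direct and indirect log hazard ratios.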

The application of negative controls in clinical epidemiology studies serves to identify and resolve sources of spurious causal inference7. Measurement of air sodium concentrations occurs in the same contextual setting as PM2.5, and there is no biologic basis to support an association of air sodium concentration with either diabetes or kidney disease outcomes1,2,4. We therefore a) tested the association between ambient air sodium levels and the risk of diabetes, and b) tested the association between ambient air sodium concentrations and risk of kidney outcomes. Missing data were not imputed. In all analyses, a p-value less than 0.05 or a 95% confidence interval that did not contain unity was considered statistically significant. All analyses were conducted in SAS Enterprise Guide 7.1 (SAS Institute, Cary NC).

To test the robustness of study findings we undertook a number of sensitivity analyses: (1) An alternate exposure definition using the air monitoring station nearest a participant’s place of residence, within a maximum of 30 miles, was used. (2) Models were additionally adjusted for county level characteristics in domains including demographics, physical environment, social and economic factors, and health behaviors as a means of investigating whether shared regional confounders influenced results37; in this analysis, we conducted a principal component analysis to address multicollinearity amongst the different contextual county characteristics, and then selected components with an eigenvalue greater than 1 for inclusion in our models50. (3) It has been suggested that when an outcome is not rare, mediation assessed using a proportional hazard model such as the Cox model may be biased51; as such, accelerated failure time models with a Weibull distribution, selected based on the form of the log negative log plot and best fit via the AIC, were used. (4) We analyzed the outcome of greater than or equal to 50% decline in eGFR alone to investigate whether removing ESRD from the composite outcome (ESRD or ≥50% decline in eGFR) modified the results.
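The nearest-station exposure assignment in sensitivity analysis (1) could be sketched as below. This is an illustration only: the text does not state how distance was computed, so great-circle (haversine) distance between a residence centroid and each station is an assumption, as are the function names, the Earth radius constant, and the example coordinates.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius; an assumed constant


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def nearest_station(residence, stations, max_miles=30.0):
    """Return (station_id, distance) for the closest station within max_miles.

    residence: (lat, lon) of the participant's residence centroid
    stations:  dict mapping station id to (lat, lon)
    Returns None when no station lies within max_miles, in which case the
    participant would have no exposure value under this definition.
    """
    best = None
    for station_id, (lat, lon) in stations.items():
        dist = haversine_miles(residence[0], residence[1], lat, lon)
        if dist <= max_miles and (best is None or dist < best[1]):
            best = (station_id, dist)
    return best


# Hypothetical residence and station coordinates.
home = (38.63, -90.20)
monitors = {"A": (38.70, -90.30), "B": (39.10, -94.58)}
match = nearest_station(home, monitors)
```

Participants with no station inside the radius return None here, which mirrors how a distance cap restricts the analytic sample in this kind of sensitivity analysis.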