Introduction

The health threats associated with air pollution are a worldwide public health issue. As an increasing number of studies provide evidence on the relationship between air pollution and diseases, such as obesity, hypertension, diabetes mellitus, cancer, etc., the World Health Organization declared air pollution as a fifth risk factor for noncommunicable diseases in 2018.1 Based on the Global Burden of Diseases Study, Cohen et al. estimated that the number of deaths attributed to exposure to particulate matter <2.5 µm (PM2.5) was 4.2 million in 2015, accounting for 7.6% of global deaths. Compared with developed countries, densely populated and less regulated middle- and low-income countries are faced with more severe challenges.2

The association between air pollution and metabolic abnormalities mentioned above has been demonstrated in both adults and children, suggesting that the influence of PM2.5 runs through the whole life cycle.3,4,5,6 Since previous studies found that neonates with birth weight exceeding the normal range were more likely to develop obesity, diabetes mellitus, and hypertension,5,6,7,8 it is worth exploring whether PM2.5 plays a role in metabolic diseases partly through increasing birth weight. Supporting evidence came from a study in Lanzhou, China that reported a positive association between PM10 exposure and fetal overgrowth.9 Generally speaking, however, the evidence on the association between air pollution and birth weight remains insufficient. On the one hand, the results from different studies seem to conflict: apart from overgrowth, restricted fetal growth has also been observed to be related to air pollution.9,10 On the other hand, researchers have tended to take the absolute value of birth weight as an outcome regardless of the impact of gestational age on birth weight. Compared to macrosomia (birth weight ≥4000 g), large for gestational age (LGA), defined as a birth weight above a certain percentile for gestational age, is a more appropriate measure of abnormal birth weight. LGA includes a wider range of neonates with intrauterine overgrowth, some of whom might weigh <4000 g and could not be diagnosed with macrosomia due to prematurity, but are also at high risk of perinatal complications and future obesity.8,11

Therefore, we aimed to explore the relationship between prenatal PM2.5 exposure and the incidence of LGA in a nationwide prospective study conducted in China, a country with poor air quality.

Methods

Study design

The cohort study was conducted within the framework of the National Free Preconception Health Examination Project, which provided health examinations to women preparing for pregnancy across mainland China. Two hundred and twenty experimental counties in 31 provinces, municipalities, and autonomous regions were selected for the pilot project from 1 January 2010 to 31 December 2012, and all rural and migrant women who planned to become pregnant and agreed to participate in the project were consecutively included. During the project, women received a pregestational baseline examination and were followed up throughout gestation until delivery.12 Apart from routine physical and laboratory examinations, factors that might influence pregnancy outcomes were also recorded through face-to-face interviews, including sociodemographic characteristics, medical history, family history, and lifestyle habits (smoking, alcohol consumption, etc.). All examinations and information acquisition were conducted by qualified staff following a standard protocol. Women who failed to complete the follow-up examination and interview were excluded, as were those with stillbirth, multiple births, or missing or extreme data. The study was approved by the National Health Commission of China, and informed consent was obtained from participants.

Outcome assessment

The outcome was LGA birth, defined as a newborn whose birth weight was above the 90th percentile for gestational age.13 The 2015 Chinese reference of birth weight for gestational age was used to classify LGA in our study population.14 Gestational age was measured as the number of weeks between the self-reported date of the last menstrual period and the date of delivery.
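The outcome definition above can be illustrated with a minimal sketch, assuming completed weeks are counted by integer division of the day difference. The function names and the cut-off table are hypothetical; the gram values are placeholders and are not taken from the 2015 Chinese reference, and the sketch ignores any stratification (for example by neonatal sex) that the reference may use.

```python
from datetime import date

# Hypothetical 90th-percentile birth-weight cut-offs (grams) by completed
# gestational week. Placeholder values for illustration only; they are NOT
# taken from the 2015 Chinese reference.
P90_CUTOFF_G = {37: 3500, 38: 3700, 39: 3850, 40: 3950, 41: 4050}


def gestational_weeks(lmp: date, delivery: date) -> int:
    """Completed weeks between the last menstrual period and delivery."""
    return (delivery - lmp).days // 7


def is_lga(birth_weight_g: float, lmp: date, delivery: date) -> bool:
    """True when birth weight exceeds the week-specific 90th percentile."""
    weeks = gestational_weeks(lmp, delivery)
    return birth_weight_g > P90_CUTOFF_G[weeks]
```

With these placeholder cut-offs, a 4000 g neonate delivered 280 days after the last menstrual period (40 completed weeks) would be classified as LGA, whereas a 3900 g neonate at the same gestational age would not.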

Exposure assessment

Individual PM2.5 exposure levels were evaluated using monthly county-specific PM2.5 concentrations according to women’s registered residences. The PM2.5 concentration in all included counties was estimated from satellite data using an ensemble machine learning model that combined a random forest, a generalized additive model, and XGBoost. The model was trained using PM2.5 monitoring data obtained from 1593 monitoring stations and satellite-retrieved aerosol optical depth from 2013 to 2019. To account for spatial heterogeneity, the training process was conducted separately in different regions of China, which were defined via a spatial clustering method. The prediction accuracy of the model was validated through by-year cross-validation, and data from the 2008 Beijing Olympic Games were used for external validation. The model achieved high accuracy in predicting historical PM2.5 concentrations (cross-validation R2 0.76, root mean square error 16 µg/m3). More details about the exposure assessment are available in the study by Xiao et al.15 The PM2.5 exposure level in each province in mainland China was estimated as the average PM2.5 concentration in its experimental counties. The average monthly PM2.5 concentrations during the first to third months of gestation, the fourth to sixth months of gestation, and the seventh month of gestation to delivery were taken as the trimester-specific PM2.5 concentrations in the first, second, and third trimesters, respectively. We excluded women whose places of delivery were different from their registered residences to reduce inaccurate PM2.5 estimates resulting from moving during pregnancy.
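The trimester windows described above can be sketched as follows. This is a minimal illustration, assuming the monthly county-level concentrations are already aligned to gestational months; the function name is hypothetical, and the whole-pregnancy value is computed here as the plain mean of all gestational months, which is an assumption rather than a documented detail of the analysis.

```python
from statistics import mean


def trimester_means(monthly_pm25):
    """Average monthly PM2.5 (µg/m3) within each exposure window.

    monthly_pm25 lists county-level monthly concentrations from the first
    month of gestation through the month of delivery.
    """
    if len(monthly_pm25) < 7:
        raise ValueError("need at least seven gestational months")
    return {
        "first": mean(monthly_pm25[0:3]),   # months 1-3
        "second": mean(monthly_pm25[3:6]),  # months 4-6
        "third": mean(monthly_pm25[6:]),    # month 7 to delivery
        "whole": mean(monthly_pm25),        # assumed: mean of all months
    }
```

Because the third window runs from the seventh month to delivery, its length varies with gestational age, so a preterm birth contributes fewer months to the third-trimester average.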

Statistical analysis

Binary logistic regression was conducted to calculate odds ratios (ORs) and 95% confidence intervals (CIs) to assess the association between a 10 µg/m3 increase in prenatal PM2.5 exposure and LGA. Potential confounding factors were adjusted for in three different models: (i) model 1 adjusted for maternal age and prepregnancy BMI;16 (ii) model 2 adjusted for multiparity, gestational age, and neonatal sex in addition to the factors adjusted for in model 1;17 (iii) model 3 adjusted for prepregnancy hypertension, prepregnancy diabetes mellitus, family history of diabetes mellitus, cigarette use and alcohol consumption during pregnancy, maternal educational level, and season of delivery in addition to the variables adjusted for in model 2.18,19,20,21 The relationship between PM2.5 exposure and LGA, defined as birth weight >97th percentile for gestational age, was evaluated in a sensitivity analysis. We also conducted propensity score matching incorporating the same covariates as those included in model 3: maternal age, prepregnancy BMI, multiparity, gestational age, neonatal sex, prepregnancy hypertension and diabetes mellitus, family history of diabetes mellitus, maternal educational level, gestational cigarette use and alcohol consumption, and season of delivery. Participants were divided into a low-exposure group and a high-exposure group according to their exposure level during the whole pregnancy; the cut-off value was set at 75 μg/m3 based on the Ambient Air Quality Standards (GB 3095-2012).22 Pregnancies in the high-exposure group were matched to low-exposure pregnancies at a ratio of 1:1 using a caliper of 0.02, and the standardized differences in baseline characteristics of the unmatched and matched cohorts were calculated to evaluate matching performance.23 We compared outcomes between the high- and low-exposure groups using logistic regression.
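As an illustration of how matching performance can be checked, the sketch below computes the standardized difference for a binary covariate (for example, multiparity) between two exposure groups. It uses the commonly applied formula based on group proportions; the function name is hypothetical, and this is not necessarily the exact computation used in the analysis.

```python
import math


def standardized_difference(p_high: float, p_low: float) -> float:
    """Standardized difference for a binary covariate.

    p_high and p_low are the proportions with the characteristic in the
    high- and low-exposure groups.
    """
    pooled_var = (p_high * (1 - p_high) + p_low * (1 - p_low)) / 2
    if pooled_var == 0:
        return 0.0  # covariate is constant in both groups
    return (p_high - p_low) / math.sqrt(pooled_var)
```

For example, proportions of 0.60 and 0.40 give a standardized difference of about 0.41, which would exceed the 0.2 threshold stated in this section, whereas identical proportions give 0.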
Restricted cubic spline models fitted with four knots and adjusted for the covariates in model 3 were used to evaluate the dose–response relationship between PM2.5 concentration and the OR of LGA relative to a reference PM2.5 concentration of 75 μg/m3. Subgroup analyses were performed to explore the effects of maternal age (<25, 25–29, ≥30 years) and prepregnancy BMI (<18.5, 18.5–23.9, and ≥24 kg/m2).24 The inverse probability weighting (IPW) method was used to evaluate the selection bias resulting from the exclusion of participants who were lost to follow-up or had missing data. A stepwise logistic regression model was fitted to identify the variables associated with the inclusion and exclusion of participants. Unstabilized weights were estimated with a numerator of 1 and a denominator computed using logistic regression with inclusion (yes/no) as the outcome and the factors associated with participant inclusion as predictors. The ORs adjusted for missing data using IPW were then calculated. Statistical analyses were conducted using R 3.6.1. Two-sided P values <0.05 were considered statistically significant, and standardized differences >0.2 were considered to indicate meaningful imbalance between groups.
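The unstabilized weights described above can be sketched as follows, assuming the predicted probabilities of inclusion have already been obtained from the inclusion model. The function and argument names are hypothetical, and the probabilities in the usage note are made up for illustration.

```python
def unstabilized_weights(inclusion_prob, included):
    """Return 1 / P(included) for each participant kept in the analysis.

    inclusion_prob: predicted probability of inclusion for every participant.
    included: booleans marking who was actually included.
    """
    weights = []
    for prob, keep in zip(inclusion_prob, included):
        if keep:
            weights.append(1.0 / prob)  # numerator 1, denominator P(included)
    return weights
```

For instance, an included participant with a predicted inclusion probability of 0.25 receives a weight of 4 and thus stands in for three similar participants who were excluded, while excluded participants receive no weight.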

Results

Of 248,501 women in 220 counties who were pregnant during our study period, we excluded 31,539 participants who were lost to follow-up, 4961 who delivered somewhere other than their residence region, 1417 who gave birth to more than one child, 1103 who had a stillbirth, and 13,238 who had extreme or undocumented baseline characteristics (Supplementary Fig. S1). The final study population included the remaining 196,243 births, which consisted of 166,289 (84.7%) non-LGA and 29,954 (15.3%) LGA births. Table 1 summarizes the baseline characteristics of the included population. The distribution of PM2.5 around China and the average PM2.5 level in each stage of pregnancy are shown in Fig. 1. The mean PM2.5 concentration was 79.2 ± 22.8 and 74.6 ± 22.5 µg/m3 during the whole pregnancy in the LGA and non-LGA groups, respectively. The LGA group was exposed to a significantly higher level of PM2.5 than the non-LGA group (P < 0.01).

Table 1 Maternal and neonatal characteristics of the included births.
Fig. 1: Temporal and spatial distribution of PM2.5.

a Distribution of exposure to PM2.5 over the entire pregnancy in each province in mainland China based on data from experimental counties. The gray color indicates missing data. b PM2.5 concentration during the entire pregnancy in each province. Boxplots represent the medians, interquartile range, and minimum and maximum interval. c Average PM2.5 concentration during each stage of pregnancy. *P < 0.01.

As shown in Table 2, we found a significant positive association between PM2.5 exposure and LGA outcomes during each trimester of pregnancy. In a fully-adjusted model, the ORs (95% CI) of LGA for every 10 µg/m3 increase of PM2.5 in the first trimester, second trimester, third trimester, and whole pregnancy were 1.055 (1.051, 1.060), 1.078 (1.073, 1.084), 1.042 (1.038, 1.046), and 1.097 (1.091, 1.103), respectively. When we defined LGA as birth weight above the 97th percentile for gestational age, the result was compatible with the main finding (Table 3).
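To help interpret the per-10 µg/m3 estimates, the sketch below rescales an OR to another exposure increment. It assumes the log-odds of LGA are linear in PM2.5, as in the logistic regression models, so it should be read only as an approximation where the dose–response relationship departs from linearity. The function name is hypothetical.

```python
import math


def rescale_or(or_per_10: float, increment: float) -> float:
    """OR for an arbitrary PM2.5 increment (µg/m3) under a log-linear model."""
    beta_per_unit = math.log(or_per_10) / 10.0  # log-odds per 1 µg/m3
    return math.exp(beta_per_unit * increment)
```

Under this assumption, the whole-pregnancy OR of 1.097 per 10 µg/m3 corresponds to an OR of about 1.20 per 20 µg/m3 (1.097 squared).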

Table 2 Association between 10 µg/m3 increase in PM2.5 concentration and risk of LGAa.
Table 3 Association between 10 µg/m3 increase in PM2.5 concentration and risk of LGA defined as birth weight >97th percentile for gestational agea.

After 1:1 propensity score matching of 90,320 low-exposed pregnancies with high-exposed pregnancies, 170,376 pregnancies with balanced covariates were included in a matched cohort (Supplementary Table S1). The incidence of LGA was 13.1% and 17.1% in the low- and high-exposure groups, respectively. Propensity-score-matching analysis showed results similar to those of the above logistic regression models, with an OR of 1.37 (95% CI 1.33–1.41).
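As a quick arithmetic check, a crude OR can be computed directly from the two incidences reported for the matched cohort. This is a minimal sketch with a hypothetical function name; it is not the logistic regression used in the analysis, although the two should agree closely when the matched groups are of equal size and no further adjustment is made.

```python
def odds_ratio(incidence_exposed: float, incidence_unexposed: float) -> float:
    """Crude odds ratio from two incidence proportions."""
    odds_exposed = incidence_exposed / (1 - incidence_exposed)
    odds_unexposed = incidence_unexposed / (1 - incidence_unexposed)
    return odds_exposed / odds_unexposed


# Incidence of LGA: 17.1% (high exposure) versus 13.1% (low exposure).
crude_or = odds_ratio(0.171, 0.131)
```

The crude value is approximately 1.37, in line with the OR reported for the propensity-score-matching analysis.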

Figure 2 shows the dose–response relationship between the PM2.5 exposure level and the risk of LGA. The risks of LGA increased as the PM2.5 concentration increased with a nonlinear relationship. The shape of the dose–response curve during the whole pregnancy (Fig. 2d) indicated that the risk of LGA increased more rapidly at high levels of PM2.5 exposure than at low levels of exposure.

Fig. 2: Relationship between PM2.5 and LGA.

Dose–response relationships between PM2.5 concentration and the odds ratio of LGA, relative to the chosen reference PM2.5 concentration of 75 μg/m3, in the a first trimester, b second trimester, c third trimester, and d entire pregnancy, estimated using restricted cubic spline models.

In subgroup analyses, we found no significant difference in the risk of LGA among BMI and maternal age subgroups (Table 4).

Table 4 Adjusted odds ratios of LGA for each 10 µg/m3 increase in PM2.5 concentration over entire pregnancy in each subgroup.

In the IPW analysis, we included 191,155 births retained in the final analysis and 33,897 births excluded from it, all of which had complete records of the variables used as potential predictors of inclusion/exclusion. Supplementary Table S2 shows the results of the logistic regression model used to predict the probability of being included. Table 5 shows the ORs adjusted for missing data using IPW, which did not differ significantly from the results of the main analysis.

Table 5 Association between 10 µg/m3 increase in PM2.5 concentration and risk of LGA using inverse probability weightinga.

Discussion

Our study showed that a 10 µg/m3 increase in prenatal PM2.5 concentration was associated with a 9.7% increase in the odds of LGA. To the best of our knowledge, this is the first nationwide study evaluating the relationship between maternal PM2.5 exposure during pregnancy and neonatal birth weight in the context of LGA in areas with high levels of pollution. Using LGA as an indicator of abnormal birth weight is important because birth weight is highly dependent on gestational age. For example, some premature neonates might not meet the diagnostic standard of macrosomia due to prematurity but may weigh more than most other neonates of the same gestational age and could be diagnosed with LGA. These neonates face risks of perinatal complications and future obesity similar to those of neonates with macrosomia.8,11 The severity of air pollution in the study areas also matters. In contrast to observations in low-pollution countries, a study conducted in highly polluted areas found that prenatal exposure to high levels of PM10 increased the risk of fetal overgrowth.9

Our results were different from those of most previous studies. Sun et al. summarized a total of 32 studies and reported a statistically significant association between low birth weight and PM2.5 exposure (per 10 µg/m3) (OR 1.090, 95% CI 1.032, 1.150).25 The difference could be explained in two ways. First, there was a large gap between the PM2.5 level in our study and those in previous studies, in which the mean monitored PM2.5 concentration ranged from 9.4 to 22.1 µg/m3.25 It is notable that both our study and the study by Zhang et al., which reported a positive association between air pollution and fetal overgrowth, were conducted in highly polluted areas.9 It is possible that maternal behavior patterns are influenced by serious air pollution (e.g., fewer outdoor activities), which in turn affects birth weight. Second, we adjusted for confounding factors related to demography, anthropometric measurements, medical history, family history, and personal history through logistic regression and propensity score matching. Compared with previous studies that mainly adjusted for demographic factors, we adjusted for a more comprehensive set of confounders.26,27 This full adjustment was made possible by the large sample size and the detailed data collection during enrollment and follow-up, and it might reduce possible confounding bias.

The positive association between trimester-specific PM2.5 concentration and risk of LGA was consistent in all three trimesters. It is generally accepted that genetic factors play a major role in the early stage of pregnancy,28 while nutrition and metabolic factors have more influence as a pregnancy enters its later stage.29 So the consistent relationship through three trimesters suggested both genetic factors and intrauterine environmental factors are involved in the overgrowth associated with LGA. Air pollution could influence fetal growth through epigenetic modification. A study enrolling 527 mother–infant pairs in Zhengzhou, China revealed that prenatal air pollution could influence birth size via DNA methylation.30 Also, a significant relationship between prenatal PM2.5 exposure and cord blood DNA methylation of genes related to fetal growth was observed.31 Air pollution might also play a role in the development of LGA by altering maternal metabolism. Animal studies indicated that exposure to unfiltered polluted air increased insulin resistance and oxidative stress in pregnant rats. Compared to those exposed to filtered air, polluted air-exposed mice had more gestational weight gain,32 which is a recognized risk factor for LGA.33 Therefore, based on our findings, the impact of PM2.5 on LGA lasted for the whole pregnancy. The possible mechanisms involve epigenetic modification and metabolic regulation.

The dose–response relationship between PM2.5 concentration and odds ratios of LGA showed that at high levels of exposure, the risk of LGA increased faster as PM2.5 concentration increased. This result confirmed the importance of air pollution control.

Our study enrolled more than 190,000 pregnant women throughout mainland China, which made our results highly representative. Although there were no monitoring networks in China during our study period, we managed to accurately estimate the PM2.5 concentration on the basis of satellite data using a machine learning method. We first found that the PM2.5 exposure level in the LGA group was significantly higher than that in the non-LGA group, and we then confirmed the robustness of the relationship between PM2.5 and the risk of LGA using logistic regression models adjusted for different combinations of confounding factors and propensity-score-matching analysis. The dose–response relationship further suggested that the higher the exposure, the higher the risk of LGA. Our study also had some limitations. First, we failed to adjust for gestational diabetes mellitus and gestational weight gain in our analysis due to the lack of records. LGA births accounted for 15.3% of the total population in our study. Theoretically, the LGA proportion in a nationwide study should be around 10% according to the LGA definition. The difference might be partly due to the different provinces and target populations selected. Compared with the study that established the Chinese growth chart, more provinces were included in our study, and rural and migrant women were our target population.14 Living habits and diets vary across China; thus, nutrient supply and gestational weight gain might differ between samples, resulting in different fetal growth and prevalence of LGA. It is necessary to consider the impact of gestational weight gain on birth weight in future studies. Second, the exposure was estimated at the county level, which means that participants in the same county were assumed to have the same exposure on the same day. In reality, the PM2.5 distribution varies between areas within a county.
The exposure level of a participant depended on the participant's specific residence and was influenced by multiple factors, including topography, traffic, and greenery coverage. However, we believe that the rough estimates, which ignore individual differences, could still reflect the average exposure of the participants because the variation in PM2.5 level within a county was relatively small, and that the main conclusion is credible though less precise. Third, owing to a lack of statistical data, we failed to adjust for the socioeconomic level of each county in our analysis, which might result in confounding bias. In addition, program management in different counties might affect participants’ motivation. In counties with poor management, fewer women might have been willing to participate in the program, which could result in selection bias. Fourth, we excluded women who delivered in a region different from their registered residence to minimize estimation errors resulting from moving, but there might still have been undetected moves during pregnancy.

Conclusions

In conclusion, our study supported that prenatal exposure to PM2.5 increased the risk of LGA, and the effect was consistent in different exposure windows. Prenatal and perinatal care in highly polluted areas should be reinforced to reduce the likelihood of LGA and relevant complications. The underlying mechanisms of our findings need to be studied further.