Breast cancer is the most frequently diagnosed cancer among women worldwide1, and its incidence is rapidly increasing in industrialized countries and urban areas. In South Korea, breast cancer is the second most common cancer after thyroid cancer, and its incidence increased by 6.1% annually from 1999 to 20142,3. Increased exposure to environmental female hormones is thought to contribute to the rise in breast cancer incidence. In addition, hormone-dependent cancers are increasing in industrialized countries1,4.

There is growing evidence that air pollution is a risk factor for breast cancer. Nitrogen oxides (NO2 and NOx)5,6,7,8,9, particulate matter (PM10 and PM2.5)10,11, and polycyclic aromatic hydrocarbons (PAHs)12,13 have been reported to be associated with breast cancer incidence. The physiological mechanisms by which air pollutants affect breast cancer are largely explained in two ways. First, air pollutants may directly cause genetic mutations, as they are carcinogenic14,15. Second, air pollutants may affect breast cancer incidence by increasing breast density, which is a known risk factor. Yaghjyan et al. reported associations between exposure to PM2.5 and O3 and mammographic breast density16. Female hormones affect breast density, and some air pollutants, such as xenoestrogens, are known to exhibit endocrine-disrupting properties17. However, the Danish Diet, Cancer and Health cohort (1993–1997) study found little evidence of an association between traffic-related air pollution exposure and breast density18. Hung et al. reported a positive association between high levels of PM2.5 and the breast cancer mortality rate, treating PM2.5 as a marker for polycyclic aromatic hydrocarbons19.

The evidence for an association between air pollution and breast cancer risk remains inconsistent20. Although Crouse et al.6 found a positive association between NO2 concentration and breast cancer incidence at the 95% confidence level, subsequent studies found positive associations with partial8,10,11, marginal, or null8,10,11,21,22,23 statistical confidence. For example, Hystad et al. found a significant association between NO2 concentration and premenopausal breast cancer, but only a marginal association between NO2 and postmenopausal breast cancer, in Canadian women8. Conversely, Anderson et al. found significant positive associations between the incidence of postmenopausal breast cancer in European women and NOx and nickel in PM10, but found only suggestive evidence for PM2.5, PM10, PMcoarse, NO2, nickel in PM2.5, vanadium in PM2.5, and vanadium in PM1010. Incorporating more cases or expanding the cohort population may clarify the evidence, though such efforts would demand substantial resources. Ecological studies have also examined the association between breast cancer and air pollution5,7,9. Chen et al.5 and Wei et al.9 aggregated air pollutant emission data and age-adjusted breast cancer incidence rates from the Surveillance, Epidemiology, and End Results Program of the United States National Cancer Institute, which covered 199 counties and a female population of approximately 100,000 or fewer, and found positive correlations. Datzman et al.7 found significant positive associations of NO2 and PM10 with breast cancer incidence using healthcare data from a local insurance corporation in Germany that covered a female population of approximately 1,000,000 and 9,577 incident cases.

We performed an ecological study to investigate the associations of CO, NO2, SO2, O3, and PM10 with breast cancer incidence and mortality rates across all 252 administrative districts in South Korea. South Korea has a unique national healthcare insurance system that fully covers its 52 million citizens and provides mammography-based breast cancer screening to all female citizens aged ≥40 every two years24. The lifetime screening rate for breast cancer in females aged ≥40 was as high as 83.1% by 201325. Among the Organization for Economic Co-operation and Development nations in 2012, South Korea had the eighth highest age-adjusted breast cancer incidence rate, 52.1 per 100,000, but the lowest mortality rate, 6.1 per 100,00026. The full coverage of national healthcare insurance and the high screening rate may have facilitated appropriate medical intervention for breast cancer. The relative 5-year survival rate was higher than those of the United States and Japan in similar years: 92.3% in South Korea (2011–2015), 91.1% in the United States (2007–2013), and 91.1% in Japan (2006–2008)26. Moreover, the national breast cancer screening service followed up all breast cancer patients in South Korea by residential address and exhaustively recorded the incidence and mortality rates24. These unique healthcare settings surrounding breast cancer make South Korea a natural testing ground for investigating associations between air pollution and breast cancer incidence and mortality rates.


Characteristics of the observed districts throughout the study period are shown in Table 1. Fig. 1 maps the NO2 and PM10 concentrations and the age-adjusted breast cancer incidence and mortality rates across South Korea. The Korean female population in 2010 (the middle of the study period) was 24,149,865. The total number of incident breast cancer cases in 2004–2013 was 133,373, and the total number of breast cancer deaths in 2005–2016 was 23,565.

Table 1 Characteristics of the study area.
Figure 1 Concentrations of NO2 and PM10, and age-adjusted breast cancer incidence and mortality rates portrayed on the South Korean map. (A) NO2 concentration in average of the study period (2004–2016), (B) PM10 concentration in average of 2004–2016, (C) age-adjusted breast cancer incidence rate in average of 2004–2013, and (D) age-adjusted breast cancer mortality rate in average of 2005–2016, in South Korea.

In the multicollinearity analysis among the air pollutants, the NO2 and O3 concentrations were collinear, with a correlation coefficient of −0.862. O3 was excluded from the multivariable model because its correlation with the breast cancer incidence rate (0.659) was weaker than that of NO2 (0.774). Both NO2 and O3 concentrations were poorly correlated with the breast cancer mortality rate (<0.2 for both). The other air pollutants, CO, SO2, and PM10, showed little evidence of collinearity, with correlation coefficients <0.7. Among the confounding factors, higher-education rate was collinear with population density, parity, unemployment rate, breastfeeding rate, and oral contraceptive usage rate, with correlation coefficients >0.7. Of these, only higher-education rate was included in the model because it had the strongest association with the breast cancer incidence rate (0.832). Altitude, smoking rate, obesity rate, gross regional domestic product (GRDP) per capita, and higher-education rate were not significantly collinear with one another, so these five confounding factors were incorporated as covariates into the beta regression model used to estimate the odds ratios (ORs) of CO, NO2, SO2, and PM10 concentrations for breast cancer incidence and mortality rates.
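The screening rule described above (flag pairs correlated beyond 0.7, then keep the member more strongly correlated with the outcome) can be sketched as follows. This is an illustrative sketch, not the authors' code: the function names are hypothetical, and the use of absolute correlations is an assumption inferred from the NO2 and O3 pair (−0.862) being treated as collinear.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson's correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def screen_collinear(predictors, outcome, threshold=0.7):
    """Drop one variable from each collinear pair (hypothetical helper).

    predictors: dict mapping variable name -> list of district-level values.
    outcome:    list of district-level outcome values (e.g. incidence rate).
    Assumption: collinearity is judged on |r| > threshold.
    """
    kept = dict(predictors)
    while True:
        # Find the most strongly collinear pair still in the model.
        worst = None
        for a, b in combinations(kept, 2):
            r = abs(pearson(kept[a], kept[b]))
            if r > threshold and (worst is None or r > worst[0]):
                worst = (r, a, b)
        if worst is None:
            return list(kept)
        _, a, b = worst
        # Exclude the member with the weaker correlation with the outcome.
        ra = abs(pearson(kept[a], outcome))
        rb = abs(pearson(kept[b], outcome))
        del kept[a if ra < rb else b]
```

The greedy loop handles chains of collinear variables one pair at a time; other orderings are possible and could retain a different set.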

Table 2 shows the OR estimates from single-pollutant and multi-pollutant multivariable beta regression models adjusted for altitude, higher-education rate, smoking rate, obesity rate, and GRDP per capita. In single-pollutant models, all four pollutants (CO, NO2, SO2, and PM10) were significantly associated with breast cancer incidence. For example, a 10 ppb higher NO2 concentration in a district was associated with an OR of 1.14 for breast cancer incidence (95% confidence interval: 1.12–1.16). In multi-pollutant models, all four pollutants remained associated with the breast cancer incidence rate after additional adjustment for the other three pollutants. In contrast, the associations between air pollutant concentrations and breast cancer mortality rates were weaker. CO, NO2, and SO2 showed positive but non-significant associations in both single- and multi-pollutant models. Only PM10 was significantly associated with the breast cancer mortality rate in both single- and multi-pollutant models, but with smaller ORs than those for the breast cancer incidence rate.
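To make the "per 10 ppb" reading of an OR concrete, the sketch below converts between a regression coefficient and an OR for a chosen exposure increment. It assumes a logit link for the mean, so that the exponentiated coefficient reads as an odds ratio; the link function is not stated in the text, and the function names are hypothetical.

```python
import math


def odds_ratio_per_increment(beta_per_unit, increment):
    """OR for an `increment`-unit rise in exposure, given a logit-scale
    coefficient per single unit (assumes a logit link)."""
    return math.exp(beta_per_unit * increment)


def beta_from_odds_ratio(odds_ratio, increment):
    """Inverse: per-unit logit-scale coefficient implied by an OR that was
    reported for an `increment`-unit rise in exposure."""
    return math.log(odds_ratio) / increment
```

Under that assumption, the reported OR of 1.14 per 10 ppb NO2 corresponds to a per-ppb coefficient of ln(1.14)/10, roughly 0.013.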

Table 2 Odds Ratios (OR) from multivariable beta regression models with air pollutants controlled for altitude, higher-education rate, smoking rate, obesity rate, and GRDP per capita.


NO2 and PM10 concentrations were significantly and positively associated with the breast cancer incidence rate in the South Korean female population. This result is consistent with previous studies that reported a significantly higher risk of breast cancer incidence5,7,9,11, and is partly consistent with studies that reported suggestive6,8,10,12,14,15,27 or null11,21,22,23,28 associations between air pollutants and breast cancer. Our study adds evidence of a significant positive association to the aforementioned studies. It is also important to note that our finding is based on region-based national census data that encompassed the entire female population of a country (24,149,865 in 2010) for more than 10 years, including the complete set of diagnosed breast cancer cases (132,811) and deaths (23,565) during that period. These region-based national census data may help account for possible confounders, including the data-collection method and other unknown factors. This adds a new layer of evidence for the association between air pollution and breast cancer incidence rates to the previous studies that had case-control or cohort settings. Differences in the significance of the positive association between our study and the aforementioned studies may be due to the relatively more severe air pollution in South Korea (Table 1) compared with that measured in European10,23 or North American cohorts6,8,11,14,22,27. If so, countries with severe air pollution, such as China or India, may also exhibit significant positive associations similar to our result. Differences in ethnic composition, diet, and culture may also have played a role. In addition, subsequent studies will be needed to uncover the physiological mechanisms and pathways underlying these associations. As previous studies14,15,16,19 suggest, NO2 and PM10 may exert both endocrine-disrupting and carcinogenic effects. Regions with severe air pollution may also co-localize with other endocrine-disrupting agents or carcinogens, thereby exhibiting a harmful association with breast cancer incidence.

On the other hand, no significant association was found between air pollution and breast cancer mortality rates, except for PM10. The breast cancer incidence and mortality rates are positively but weakly correlated in South Korea, with a Pearson’s correlation coefficient of 0.150 (p-value: 0.0173). This weak correlation implies that some districts have higher mortality rates than expected from their incidence rates. Many of these districts are located in rural areas, supporting the idea that they are underserved by the healthcare system (including late detection or late diagnosis), but we did not find significant differences between these low-incidence, high-mortality districts and the other districts. After a breast cancer diagnosis, a patient may have various treatment and management options that considerably affect mortality, and these are hard to parameterize in a census-based study. For instance, living in a polluted urban area may lead to a high incidence rate but not to a high mortality rate, because urban areas provide better access to healthcare resources than other parts of the country. Similarly, the higher-education rate is positively correlated with the incidence rate, but not with the mortality rate. Higher education is usually associated with a Westernized diet, fewer childbirths, a later first pregnancy, and less breastfeeding. These factors contribute to a higher risk of breast cancer incidence. On the other hand, higher education may lower the mortality rate by prompting patients to seek better means to fight the disease. Studies have reported that different education groups have disparate treatment and mortality patterns in South Korea29 and China30.

Our breast cancer incidence and mortality statistics are based on a validated national census database, encompassing the whole female population in South Korea. Although this makes our study robust, it also imposes limitations. Because the unit of analysis was a district, not an individual, some information considered important for breast cancer incidence could not be obtained, such as histologic subtype, stage at diagnosis, menopausal status, molecular subtype, mammographic breast density, occupational history, and patient-specific exposure to air pollution. Further research is needed to unveil the interplay of breast cancer, air pollution, and these not-yet-studied risk factors. We did not perform a lag analysis because of the limited temporal resolution of the breast cancer incidence data, which were surveyed in 5-year periods (2004–2008 and 2009–2013, only twice in the study period). Because there was no predefined ‘duration of exposure’ or ‘lag-year’ value between exposure to air pollution and a breast cancer incidence or mortality case, the average of daily air pollutant concentrations throughout the study period in each district was taken to represent the level of air pollution. More frequent surveys of breast cancer incidence would enable time-series analysis and may yield richer implications for the field. Another limitation is the lack of migration history data. We contend that migration would not significantly bias the current study because only approximately 10% of the population moved between districts in South Korea between 2003 and 201331,32, and the average duration of residence in the same district, according to the 2005 Census, was about 7.7 years31,32.


Our study suggests a positive association between air pollution and breast cancer incidence, and a less definitive association with the mortality rate. This region-based, nationwide, whole-population study adds a new layer of evidence for the association.


Ethical approval

Ethical approval was not required because this study was performed using a publicly accessible, national epidemiology database.

Breast cancer incidence and mortality statistics

The breast cancer incidence and mortality rates were extracted from the Korean Statistical Information Service (KOSIS)32, a publicly accessible database. Cases were classified by the Korean Standard Classification of Diseases code for “breast cancer”, C50, which corresponds to the same disease category in the 10th revision of the International Statistical Classification of Diseases and Related Health Problems (ICD-10). The KOSIS database provided mortality statistics for 2005–2016 and incidence statistics for 2004–2013. There are currently 252 Si-Gun-Gus in South Korea. Si-Gun-Gu is a level in the Korean administrative-area system, similar to the county in the United States. All 252 districts were included in this study, except Ulleung-Gun, a group of islands 130 km off the east coast populated by fewer than 10,000 people. The incidence and mortality rates were age-adjusted per 100,000 to the standard population of South Korea as of July 1st, 2010. Breast cancer cases and deaths in the male population were too scarce to be included in the study, so only the female population was analyzed.
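For readers unfamiliar with age adjustment, the sketch below shows direct standardization, the usual way a rate is age-adjusted to a standard population. Whether KOSIS uses exactly this procedure is an assumption, and the function name and the inputs are hypothetical.

```python
def age_adjusted_rate(cases, population, standard_population, per=100_000):
    """Directly age-standardized rate (illustrative, not the KOSIS code).

    cases:               events in each age band of the district.
    population:          district population (or person-years) per age band.
    standard_population: standard population counts for the same age bands
                         (assumed here: the national 2010 mid-year population).
    """
    if not (len(cases) == len(population) == len(standard_population)):
        raise ValueError("all inputs must cover the same age bands")
    total_standard = sum(standard_population)
    rate = 0.0
    for d, n, w in zip(cases, population, standard_population):
        # Age-specific rate weighted by the band's share of the standard.
        rate += (d / n) * (w / total_standard)
    return rate * per
```

Weighting every district by the same standard age structure lets districts with older or younger populations be compared on equal terms.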

Air pollution

Air pollution data for the whole study period and all study areas were acquired from a publicly accessible database. CO, SO2, NO2, O3, PM10, and PM2.5 concentrations are measured by the National Ambient Air Quality Monitoring Information System (NAMIS) and are publicly accessible via the AirKorea website. In total, there are 332 measurement stations nationwide. PM2.5 was not assessed in the current study because of a shortage of measurement stations in the early study period. The daily average of each pollutant at each station was collected.

Although the air pollution data are based on the measurement stations, the population, incidence, and death statistics are based on the Si-Gun-Gu district system, so the two do not match directly. To match and integrate the datasets, we obtained the latitude and longitude of each air pollution measurement station and of each district's administrative office, which served as the representative location of the district. We then estimated the average air pollutant concentrations throughout the study period at each administrative office by linearly interpolating the measurements from the three surrounding stations. Python programming language version 2.7 (Python Software Foundation, Beaverton, Oregon, United States) was used for this procedure.
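One way to read "linearly interpolating from the surrounding three stations" is barycentric interpolation over the triangle formed by those stations. The sketch below follows that reading; it is an assumption, not the authors' script, and it treats longitude and latitude as planar coordinates, which is only an approximation over short distances. The function names are hypothetical.

```python
def barycentric_weights(p, a, b, c):
    """Weights of point p with respect to triangle (a, b, c).

    Each point is a (longitude, latitude) pair treated as planar coordinates.
    The weights sum to 1; all are non-negative when p lies inside the triangle.
    """
    (xp, yp), (xa, ya), (xb, yb), (xc, yc) = p, a, b, c
    det = (yb - yc) * (xa - xc) + (xc - xb) * (ya - yc)
    if abs(det) < 1e-12:
        raise ValueError("stations are collinear; no unique linear interpolant")
    wa = ((yb - yc) * (xp - xc) + (xc - xb) * (yp - yc)) / det
    wb = ((yc - ya) * (xp - xc) + (xa - xc) * (yp - yc)) / det
    return wa, wb, 1.0 - wa - wb


def interpolate_at_office(office, stations):
    """Linearly interpolate a concentration at an administrative office.

    stations: three ((longitude, latitude), mean_concentration) pairs.
    """
    (a, va), (b, vb), (c, vc) = stations
    wa, wb, wc = barycentric_weights(office, a, b, c)
    return wa * va + wb * vb + wc * vc
```

If the office lies outside the triangle, one or more weights become negative and the result is an extrapolation; a production script would need a rule for that case, such as falling back to the nearest station.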

Because cancer incidence depends on long-term, cumulative exposure to putative carcinogens, a clear-cut exposure timing cannot be determined. Instead, we used the mean ambient pollutant concentrations over the whole study period (2004–2016).

Confounding factors

Altitude, population density, higher-education rate, smoking rate, obesity rate, parity, unemployment rate, breastfeeding rate, and oral contraceptive usage rate as of the 2010 Census, and gross regional domestic product (GRDP) per capita as of 2011, were obtained for every district via the KOSIS database and considered as potential confounding factors. The rates are defined as follows: higher-education rate (proportion of women aged >15 years with a college education or higher in the district), smoking rate (proportion of current female smokers, age-adjusted to the national standard female population), obesity rate (proportion of females with BMI >25, age-adjusted to the national standard female population), parity (number of childbirths per married woman aged >15 years), unemployment rate (proportion of unemployed women aged >15 years), breastfeeding rate (proportion of females with a history of breastfeeding), and oral contraceptive usage rate (proportion of females who have used oral contraceptives). Parity, unemployment rate, breastfeeding rate, and oral contraceptive usage rate were provided only for the 17 provinces, which is coarser than the other covariates provided for the 252 districts. For these four variables, districts in the same province were assigned the same estimate.
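Assigning a province-level estimate to each of its districts amounts to a simple lookup. A minimal sketch, with a hypothetical function name and made-up placeholder keys rather than the study's actual data layout:

```python
def attribute_province_values(district_to_province, province_values):
    """Give every district the estimate of the province it belongs to.

    district_to_province: dict mapping district name -> province name.
    province_values:      dict mapping province name -> covariate estimate.
    Raises KeyError if a district's province has no estimate.
    """
    return {
        district: province_values[province]
        for district, province in district_to_province.items()
    }
```

Because all districts within a province share one value, these covariates carry only province-level variation into the district-level analysis.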

Statistical analysis

Data are shown as median and interquartile range, with the 95% confidence interval (95% CI) where applicable. Multivariable beta regression33,34 models were built for the breast cancer incidence and mortality rates, and the odds ratio (OR) of each air pollutant for the incidence and mortality rates was estimated, adjusting for the confounding factors. The 95% confidence intervals for the ORs were estimated with a basic bootstrap method. To minimize multicollinearity in the model, variable pairs with Pearson’s correlation coefficients higher than 0.7 were identified, and the variable in each pair with the lower correlation with the breast cancer incidence and mortality rates was excluded from the model. R statistics software version 3.6.2 (R Foundation for Statistical Computing, Vienna, Austria) was used in this study.
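The basic bootstrap interval mentioned above reflects the bootstrap quantiles around the point estimate: the bounds are 2θ̂ − q(1−α/2) and 2θ̂ − q(α/2). The sketch below is written in Python for illustration, although the study's statistics were run in R. The resampling unit (districts), the number of replicates, and all names are assumptions; in the study, the statistic would be an OR from a refitted beta regression, whereas here it is any callable.

```python
import random


def percentile(sorted_values, q):
    """Linear-interpolation quantile of an already sorted list, q in [0, 1]."""
    pos = q * (len(sorted_values) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] * (1 - frac) + sorted_values[hi] * frac


def basic_bootstrap_ci(rows, statistic, n_boot=2000, alpha=0.05, seed=0):
    """Point estimate and basic bootstrap confidence interval.

    rows:      list of observations (assumed: one entry per district).
    statistic: callable mapping a list of rows to a single number.
    Returns (estimate, lower, upper).
    """
    rng = random.Random(seed)
    n = len(rows)
    theta_hat = statistic(rows)
    # Resample rows with replacement and recompute the statistic each time.
    boot = sorted(
        statistic([rows[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_boot)
    )
    q_lo = percentile(boot, alpha / 2)
    q_hi = percentile(boot, 1 - alpha / 2)
    # Basic (reverse-percentile) interval: reflect quantiles around theta_hat.
    return theta_hat, 2 * theta_hat - q_hi, 2 * theta_hat - q_lo
```

For example, `basic_bootstrap_ci(values, statistics.mean)` would give an interval for a mean; passing a function that refits the regression and returns an exponentiated coefficient would give an interval for an OR.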