Impact of air pollution on breast cancer incidence and mortality: a nationwide analysis in South Korea
Scientific Reports (2020) 10:5392 | https://doi.org/10.1038/s41598-020-62200-x

Breast cancer is one of the major health problems for women worldwide. Although there is growing evidence that air pollution increases the risk of breast cancer, previous studies remain inconsistent. Unlike previous studies, which had case-control or cohort designs, we performed a nationwide, whole-population census study. In all 252 administrative districts in South Korea, the associations between ambient NO2 and particulate matter 10 (PM10) concentrations and the age-adjusted breast cancer mortality rate in females (from 2005 to 2016, Nmortality = 23,565) and incidence rate (from 2004 to 2013, Nincidence = 133,373) were investigated via multivariable beta regression. Population density, altitude, rate of higher education, smoking rate, obesity rate, parity, unemployment rate, breastfeeding rate, oral contraceptive usage rate, and Gross Regional Domestic Product per capita were considered as potential confounders. Ambient air pollutant concentrations were positively and significantly associated with the breast cancer incidence rate: per 100 ppb CO increase, odds ratio (OR) = 1.08 (95% confidence interval [CI] = 1.06–1.10); per 10 ppb NO2, OR = 1.14 (95% CI = 1.12–1.16); per 1 ppb SO2, OR = 1.04 (95% CI = 1.02–1.05); per 10 µg/m3 PM10, OR = 1.13 (95% CI = 1.09–1.17). However, no significant association between the air pollutants and the breast cancer mortality rate was observed except for PM10: per 10 µg/m3 PM10, OR = 1.05 (95% CI = 1.01–1.09).

or null 8,10,11,[21][22][23] statistical confidence. For example, Hystad et al. found a significant association between NO2 concentration and premenopausal breast cancer, whereas only a marginal association was found between NO2 and postmenopausal breast cancer in Canadian women 8. Conversely, Anderson et al. found significant positive associations between the incidence of postmenopausal breast cancer in European women and NOx and nickel in PM10, but found only suggestive evidence for PM2.5, PM10, PMcoarse, NO2, nickel in PM2.5, vanadium in PM2.5, and vanadium in PM10 10. Incorporating more cases or expanding the cohort population may clarify the evidence, though such efforts would demand additional resources. There were also ecological studies on the association between breast cancer and air pollution 5,7,9. Chen et al. 5 and Wei et al. 9 aggregated air pollutant emission data and age-adjusted breast cancer incidence rates from the Surveillance, Epidemiology, and End Results Program of the United States National Cancer Institute, which covered 199 counties and a female population of approximately 100,000 or fewer, and found positive correlations. Datzman et al. 7 found significant positive associations between NO2, PM10, and breast cancer incidence by using healthcare data from a local insurance corporation in Germany that covers approximately 1,000,000 women and 9,577 incident cases.
We performed an ecological study to investigate the associations between CO, NO2, SO2, O3, and PM10 concentrations and the breast cancer incidence and mortality rates in all 252 administrative districts in South Korea. South Korea has a unique national healthcare insurance system that fully covers its 52 million citizens and provides a mammography-based breast cancer screening service for all female citizens aged ≥40 every two years 24 26. Moreover, the national breast cancer screening service followed up all breast cancer patients in South Korea by residential address and exhaustively recorded the incidence and mortality rates 24. These unique healthcare settings surrounding breast cancer make South Korea a natural testing ground for investigating associations between air pollution and breast cancer incidence and mortality rates.

Results
Characteristics of the observed districts throughout the study period are shown in Table 1. In Fig. 1, NO2 and PM10 concentrations and age-adjusted breast cancer incidence and mortality rates are portrayed on the South Korean map. The Korean female population as of 2010 (the middle of the study period) was 24,149,865. The total number of incident breast cancer cases in 2004-2013 was 133,373, and the total number of deaths from breast cancer in 2005-2016 was 23,565.
In the multicollinearity analysis among the air pollutants, the NO2 and O3 concentrations were collinear, with a correlation coefficient of −0.862. O3 was excluded from the multivariable model because its correlation with the breast cancer incidence rate (0.659) was weaker than that of NO2 (0.774). Both NO2 and O3 concentrations were poorly correlated with the breast cancer mortality rate (<0.2 for both). The other air pollutants, CO, SO2, and PM10, showed little evidence of collinearity, with correlation coefficients <0.7. Among the confounding factors, higher education rate was collinear with population density, parity, unemployment rate, breastfeeding rate, and oral contraceptive usage rate, with correlation coefficients >0.7. Of these, only higher education rate was included in the model because it had the strongest association with the breast cancer incidence rate (0.832). Altitude, smoking rate, obesity rate, gross regional domestic product (GRDP) per capita, and higher education rate were not significantly collinear with each other, so these five confounding factors were incorporated as covariates in the beta regression model used to estimate the odds ratios (ORs) of CO, NO2, SO2, and PM10 concentrations for the breast cancer incidence and mortality rates.
Table 2 shows the OR estimates from single-pollutant and multi-pollutant multivariable beta regression models adjusted for altitude, higher education rate, smoking rate, obesity rate, and GRDP per capita. In single-pollutant models, all four pollutants (CO, NO2, SO2, and PM10) were significantly associated with breast cancer incidence. For example, a 10 ppb higher NO2 concentration in a district was associated with an OR for breast cancer incidence of 1.14 (95% confidence interval: 1.12-1.16). In multi-pollutant models, all four pollutants remained associated with the breast cancer incidence rate when additionally adjusted for the other three pollutants.
On the other hand, air pollutant concentrations and the breast cancer mortality rate exhibited weaker associations. CO, NO2, and SO2 showed positive but non-significant associations in both single- and multi-pollutant models. Only PM10 exhibited significant associations with the breast cancer mortality rate in both single- and multi-pollutant models, but with smaller ORs than those for the breast cancer incidence rate.

Discussion
NO2 and PM10 concentrations were significantly and positively associated with the breast cancer incidence rate in the South Korean female population. This result is consistent with previous studies that reported a significantly higher risk of breast cancer incidence 5,7,9,11, and is partly consistent with studies that reported suggestive 6,8,10,12,14,15,27 or null 11,[21][22][23]28 associations between air pollutants and breast cancer. Our study adds evidence of a significant positive association to the aforementioned studies. It is also important to note that our finding is based on region-based national census data that encompassed the entire female population of a country (24,149,865 in 2010) for more than 10 years, including the complete set of diagnosed breast cancer cases (132,811) and deaths (23,565) during that period. Such region-based national census data could help account for possible confounders, including the data-collection method and other unknown factors. This adds a new layer of evidence for the association between air pollution and breast cancer incidence rates to the previous studies that had case-control or cohort settings. Differences in the significance of the positive association between our study and the aforementioned studies may be due to the more severe air pollution in South Korea (Table 1) compared with the air pollution measured in European 10,23 or North American cohorts 6,8,11,14,22,27. If so, countries with severe air pollution, such as China or India, may also exhibit significant positive associations similar to our result. Differences in ethnic composition, diet, and culture may also have played a role. In addition, subsequent studies will be needed to uncover the underlying physiological mechanisms and pathways in these associations.
As previous studies [14][15][16]19 suggest, NO2 and PM10 may have both endocrine-disrupting and carcinogenic properties. Regions with severe air pollution may co-localize with other endocrine-disrupting agents or carcinogens, thereby exhibiting a harmful association with breast cancer incidence.
On the other hand, no significant association was found between air pollution and breast cancer mortality rates, except for PM10. The breast cancer incidence and mortality rates are positively but weakly associated in South Korea, with a Pearson's correlation coefficient of 0.150 (p-value: 0.0173). This weak association implies that some districts have higher mortality rates than would be expected from their incidence rates. Many of these districts are located in rural areas, supporting the idea that they are underserved by the healthcare system (including late detection or late diagnosis), but we did not find significant differences between these low-incidence, high-mortality districts and the other districts. After a breast cancer diagnosis, a patient may have various treatment and management options that considerably affect the mortality rate, and those are hard to parameterize in a census-based study setting. For instance, living in a polluted urban area may lead to a high incidence rate but not to a high mortality rate, because such areas provide better access to healthcare resources than other parts of the country. Similarly, the higher education rate is positively correlated with the incidence rate, but not with the mortality rate. Higher education is usually associated with a Westernized diet pattern, fewer childbirths, prolonged time to first pregnancy, and less breastfeeding. These factors contribute to a higher risk of breast cancer incidence. On the other hand, higher education may lower the mortality rate by prompting patients to seek better means to fight the disease. There are studies reporting that different education groups have disparate treatment and mortality patterns in South Korea 29 and China 30.
Our breast cancer incidence and mortality statistics are based on a validated national census database encompassing the whole female population of South Korea. Although this makes our study robust, it also gives rise to a limitation. Because the unit of analysis was a district, not an individual, some information considered important for breast cancer incidence could not be obtained, such as histologic subtype, stage at diagnosis, menopausal status, molecular subtype, mammographic breast density, occupational history, and patient-specific exposure to air pollution. Further research is needed to unveil the interplay of breast cancer, air pollution, and these not-yet-studied risk factors. We did not perform a lag analysis because of the lack of temporal resolution in the breast cancer incidence data, which had been surveyed every 5 years (in the 2004-2008 and 2009-2013 periods, only twice in the study period). Because there was no predefined 'duration of exposure' or 'lag-year' value between exposure to air pollution and a breast cancer incidence or mortality case, the average of daily air pollutant concentrations throughout the study period in each district was taken to represent the level of air pollution. More frequent surveys of breast cancer incidence would enable time series analysis and may lead to richer implications for the field. Another limitation is the lack of migration history data. We contend that migrations

Conclusions
Our study suggests a positive association between air pollution and breast cancer incidence, and a less definitive association with the mortality rate. This region-based, nationwide, whole-population study adds a new layer of evidence for the association.
Methods
Ethical approval. Ethical approval was not required because this study was performed using a publicly accessible, national epidemiology database.
Breast cancer incidence and mortality statistics. The breast cancer incidence and mortality rates were extracted from the Korean Statistical Information Service (KOSIS) 32, a publicly accessible database. Breast cancer was identified by the Korean Standard Classification of Disease code C50, which corresponds to the same disease category in the 10th revision of the International Statistical Classification of Diseases and Related Health Problems (ICD-10). The KOSIS database provided mortality statistics for 2005-2016 and incidence statistics for 2004-2013. There are currently 252 Si-Gun-Gus in South Korea. Si-Gun-Gu is a level in the Korean administrative-area system, similar to the county in the United States. All 252 districts were included in this study, except Ulleung-Gun, a group of islands 130 km away from the east coast, populated by fewer than 10,000 people. The incidence and mortality rates were age-adjusted per 100,000 using the standard population of South Korea as of July 1st, 2010. Breast cancer cases and deaths in the male population were too scarce to be included in the study, so only the female population was analyzed.
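The age adjustment mentioned above can be illustrated with direct standardization, in which age-specific rates are weighted by a standard population. This is only a sketch: the source does not state which standardization procedure KOSIS applies, and the function name `age_adjusted_rate`, the three age bands, and all counts below are hypothetical.

```python
def age_adjusted_rate(cases, population, standard_population, per=100_000):
    """Directly standardized rate: age-specific rates weighted by a standard population.

    Assumption: direct standardization; the source only says the rates were
    age-adjusted per 100,000 by the 2010 standard population.
    """
    if not (len(cases) == len(population) == len(standard_population)):
        raise ValueError("all inputs must have one entry per age band")
    total_standard = sum(standard_population)
    weighted = 0.0
    for c, n, s in zip(cases, population, standard_population):
        # Age-specific rate (c / n) times the share of that band in the standard population.
        weighted += (c / n) * (s / total_standard)
    return weighted * per


# Hypothetical district with three age bands (all values invented for illustration).
cases = [2, 10, 6]
population = [20_000, 25_000, 10_000]
standard = [40_000, 40_000, 20_000]
rate = age_adjusted_rate(cases, population, standard)
print(f"age-adjusted rate per 100,000: {rate:.1f}")
```

Weighting by a common standard population makes districts with different age structures comparable, which matters here because rural and urban districts differ in age composition.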
Air pollution. Air pollution data covering the study period and the study districts were acquired from a publicly accessible database. CO, SO2, NO2, O3, PM10, and PM2.5 concentrations are measured by the National Ambient Air Quality Monitoring Information System (NAMIS) and are publicly accessible via the AirKorea website. In total, there are 332 measurement stations nationwide. PM2.5 was not assessed in the current study because of a shortage of measurement stations in the early study period. The daily average of each pollutant was collected for each station.
The air pollution data are based on the measurement stations, whereas the population, incidence, and death statistics are based on the Si-Gun-Gu district system, so the two do not match directly. To match and integrate the datasets, we obtained the latitude and longitude of each air pollution measurement station and of each district's administrative office, which served as the representative location for the district. We then estimated the average air pollutant concentrations throughout the study period at each administrative office by linearly interpolating the air pollutant measurements from the three surrounding stations. Python programming language version 2.7 (Python Software Foundation, Beaverton, Oregon, United States) was used in this procedure.
Because cancer incidence depends on long-term, cumulative exposure to putative carcinogens, a clear-cut exposure timing could not be determined. Instead, we summarized the mean ambient pollutant concentrations throughout the whole study period (2004-2016).
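One common way to interpolate linearly from three surrounding stations is barycentric interpolation within the triangle they form. The sketch below shows that approach under stated assumptions: the source does not specify the exact interpolation scheme or how the three stations were selected, coordinates are treated as planar (reasonable only over short distances), and the function name, coordinates, and concentrations are hypothetical.

```python
def barycentric_interpolate(point, stations):
    """Linearly interpolate a value at `point` from three (x, y, value) stations.

    Assumption: barycentric (planar) interpolation; the source only says
    measurements from three surrounding stations were linearly interpolated.
    """
    (x1, y1, v1), (x2, y2, v2), (x3, y3, v3) = stations
    x, y = point
    det = (y2 - y3) * (x1 - x3) + (x3 - x2) * (y1 - y3)
    if abs(det) < 1e-12:
        raise ValueError("stations are collinear; interpolation is undefined")
    # Barycentric weights of the point with respect to the station triangle.
    w1 = ((y2 - y3) * (x - x3) + (x3 - x2) * (y - y3)) / det
    w2 = ((y3 - y1) * (x - x3) + (x1 - x3) * (y - y3)) / det
    w3 = 1.0 - w1 - w2
    return w1 * v1 + w2 * v2 + w3 * v3


# Hypothetical stations: (x, y, mean concentration); values are invented.
stations = [(0.0, 0.0, 10.0), (1.0, 0.0, 20.0), (0.0, 1.0, 30.0)]
office = (0.25, 0.25)  # hypothetical administrative office location
value = barycentric_interpolate(office, stations)
print(f"interpolated concentration: {value:.2f}")
```

With this scheme, an office located exactly at a station receives that station's value, and the weights always sum to one, so the estimate stays within the range of the three measurements whenever the office lies inside the triangle.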
Confounding factors. Altitude, population density, higher education rate, smoking rate, obesity rate, parity, unemployment rate, breastfeeding rate, and oral contraceptive usage rate as of the 2010 Census, and gross regional domestic product (GRDP) per capita as of 2011, were accessed for every district via the KOSIS database and considered as potential confounding factors. The rates are defined as follows: higher education rate (rate of >15-year-old women with college education or higher in the district), smoking rate (rate of current female smokers, adjusted by the age of the national standard female population), obesity rate (rate of females with BMI >25, adjusted by the age of the national standard female population), parity (number of childbirths per married >15-year-old woman), unemployment rate (rate of unemployed >15-year-old women), breastfeeding rate (rate of females with a breastfeeding history), and oral contraceptive usage rate (rate of females with oral contraceptive usage). Parity, unemployment rate, breastfeeding rate, and oral contraceptive usage rate were provided only for the 17 provinces, which is coarser than the other covariates provided for the 252 districts. For these four variables, districts in the same province were assigned the same estimates.
Statistical analysis. Data are shown as the median and interquartile range, and the 95% confidence interval (95% CI) where applicable. Multivariable beta regression 33,34 models for the breast cancer incidence and mortality rates were built, and the odds ratio (OR) of each air pollutant for the incidence and mortality rates was estimated, adjusting for the confounding factors. To estimate the 95% confidence intervals for the ORs, a basic bootstrap method was applied.
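To make the OR and bootstrap steps concrete, the sketch below converts a regression coefficient into an OR for a given exposure increment and computes a basic (reverse-percentile) bootstrap interval. It rests on assumptions not stated in the source: a logit link for the beta regression, an interval computed on the coefficient scale and then exponentiated, and bootstrap replicates that are already available. The coefficient, the replicate values, and all function names are hypothetical.

```python
import math


def quantile(sorted_values, q):
    """Linearly interpolated quantile of an already sorted list, 0 <= q <= 1."""
    pos = q * (len(sorted_values) - 1)
    lo = math.floor(pos)
    hi = math.ceil(pos)
    frac = pos - lo
    return sorted_values[lo] * (1 - frac) + sorted_values[hi] * frac


def basic_bootstrap_ci(estimate, replicates, alpha=0.05):
    """Basic bootstrap interval: 2 * estimate minus the upper and lower replicate quantiles."""
    reps = sorted(replicates)
    lower = 2 * estimate - quantile(reps, 1 - alpha / 2)
    upper = 2 * estimate - quantile(reps, alpha / 2)
    return lower, upper


def odds_ratio(beta, increment):
    """OR for an `increment` change in exposure, assuming a logit link."""
    return math.exp(beta * increment)


# Hypothetical coefficient per 1 ppb of a pollutant and hypothetical replicates;
# in practice the replicates come from refitting the model on resampled districts.
beta_hat = 0.0131
replicates = [0.0115, 0.0120, 0.0125, 0.0128, 0.0131, 0.0134, 0.0137, 0.0142, 0.0147]
lower, upper = basic_bootstrap_ci(beta_hat, replicates)
print(f"OR per 10 ppb: {odds_ratio(beta_hat, 10):.2f} "
      f"(95% CI: {odds_ratio(lower, 10):.2f}-{odds_ratio(upper, 10):.2f})")
```

The basic bootstrap reflects the replicate quantiles around the point estimate, so it differs from the simple percentile interval whenever the bootstrap distribution is skewed.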
To minimize multicollinearity in the model, variable pairs with an absolute Pearson's correlation coefficient higher than 0.7 were identified, and within each pair the variable with the lower correlation with the breast cancer incidence and mortality rates was excluded from the model. R statistics software version 3.6.2 (R Foundation for Statistical Computing, Vienna, Austria) was used in this study.
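The screening rule above can be sketched as a greedy procedure: rank predictors by the strength of their correlation with the outcome, then keep each one only if it is not collinear with a predictor already kept. The greedy ordering and the use of absolute correlations are assumptions made for illustration, and the variable names and toy values are hypothetical, not the study data.

```python
import math


def pearson(x, y):
    """Pearson's correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def screen_collinear(predictors, outcome, threshold=0.7):
    """Keep predictors in order of |r| with the outcome, dropping any that is
    collinear (|r| > threshold) with an already kept predictor."""
    strength = {name: abs(pearson(values, outcome)) for name, values in predictors.items()}
    kept = []
    for name in sorted(predictors, key=lambda n: strength[n], reverse=True):
        if all(abs(pearson(predictors[name], predictors[k])) <= threshold for k in kept):
            kept.append(name)
    return kept


# Toy values for six hypothetical districts (invented for illustration).
outcome = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
predictors = {
    "no2": [1.1, 2.0, 2.9, 4.2, 5.0, 6.1],
    "o3": [6.0, 4.0, 5.0, 2.0, 3.0, 1.0],
    "altitude": [3.0, 1.0, 4.0, 1.0, 5.0, 2.0],
}
kept = screen_collinear(predictors, outcome)
print(kept)
```

In this toy example `o3` is dropped because it is strongly negatively correlated with `no2` and less correlated with the outcome, mirroring the direction of the NO2 and O3 decision reported in the Results.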

Data availability
The data that support the findings of this study are available from public databases: Korean Statistical Information Service, http://kosis.kr/; AirKorea, an air quality information system provided by the Korean Ministry of Environment and the Korean Environment Corporation, http://www.airkorea.or.kr/index; SHAPE file of South Korean map available at National Geographic Information Institute of Korea, http://ngii.go.kr.