Incidence of breast cancer attributable to breast density, modifiable and non-modifiable breast cancer risk factors in Singapore

The incidence of breast cancer is rising rapidly in Asia, and some breast cancer risk factors are modifiable. We examined the impact of known breast cancer risk factors, including body mass index (BMI), reproductive and hormonal risk factors, and breast density, on the incidence of breast cancer in Singapore. The study population was drawn from the Singapore Breast Cancer Screening Project, a population-based prospective trial of screening mammography. Population attributable risk and absolute risks of breast cancer due to various risk factors were calculated. Among 28,130 women, 474 (1.7%) developed breast cancer. The population attributable risk was highest for ethnicity (49.4%) and lowest for family history of breast cancer (3.8%). The proportion of breast cancers attributable to the modifiable risk factor BMI was 16.2%. The proportion of breast cancers attributable to reproductive risk factors was low: 9.2% for age at menarche and 4.2% for number of live births. Up to 45.9% of all breast cancers could be avoided if all women had breast density <12% and BMI <25 kg/m2. Notably, sixty percent of women with the lowest risk based on non-modifiable risk factors will never reach the risk level recommended for mammography screening. A combination of easily assessable breast cancer risk factors can help to identify women at high risk of developing breast cancer for targeted screening. A large number of high-risk women could benefit from risk-reduction and risk-stratification strategies.


Methods
Study population. The Singapore Breast Cancer Screening Project (SBCSP) was a population-based prospective trial of screening mammography in Singapore. In this project, 69,473 women aged 50-64 years were randomly selected and invited for a single two-view mammogram examination from 1994 through 1997 22. Further details of this program have been previously described 22. In brief, women were excluded if they had cancers of the breast or other sites (except non-melanoma skin cancer), had undergone mammography or breast biopsy in the year prior to screening, or were pregnant (n = 1,182). Further exclusions were made due to death (n = 468) or invalid address (n = 167). Of the eligible 67,656 women, 41.7% (n = 28,231) participated and were screened as part of SBCSP (see flow diagram in Fig. 1). The concern about the low participation rate was addressed in a previous publication. In brief, the incidence of breast cancer in non-respondents was slightly lower than that of women not invited for screening (P = 0.03); however, the breast cancer stage distribution did not differ significantly 22. We further excluded 101 (0.4%) participants: 14 (13.9%) with a reported age <50 years and 87 (86.1%) with missing details on intended study variables. Ethical approval and a waiver of the need for informed consent were granted by the SingHealth Centralised Institutional Review Board (REF: 205-001). In addition, this study used existing anonymous data. All research was performed in accordance with relevant regulations.
Identification of breast cancer cases. Incident cases of invasive and non-invasive breast cancer diagnosed after study entry (date of screen) until December 2007 (the end of study) were identified through

Statistical analysis. Chi-square tests were used to test for significant associations between each risk factor and incidence of breast cancer.
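As an illustration of the chi-square test of association, the following minimal sketch computes the Pearson chi-square statistic and its degrees of freedom for a table of counts. The function name and the counts are hypothetical and are not taken from the study data; obtaining a P value would additionally require a chi-square distribution function, which is omitted here.

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c count table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of risk factor and outcome.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof


# Hypothetical counts: rows = risk-factor level, columns = (cases, non-cases).
example_table = [[30, 970], [20, 980]]
stat, dof = chi_square_statistic(example_table)
```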
PAR estimates the proportion of disease cases that could be prevented if everyone in the population were shifted to the reference category. Here, the PAR for each risk factor was estimated using the method described by Bruzzi et al., by assuming a case-control study design 7. This method allows for multivariable adjusted relative risk estimates and requires only the distribution of the risk factors among the case subjects. The PAR was computed using the formula:

$$\mathrm{PAR} = 1 - \sum_{j} \frac{pd_j}{RR_j}$$

where $pd_j$ is the proportion of all cases in stratum $j$ of the risk factor and $RR_j$ is the univariable/multivariable adjusted relative risk associated with that stratum 9.
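The Bruzzi-type estimator, PAR = 1 − Σ_j (pd_j / RR_j), can be sketched in a few lines. This is a minimal illustration rather than the authors' code; the function name and the example proportions and relative risks are hypothetical.

```python
def bruzzi_par(case_proportions, relative_risks):
    """Return PAR = 1 - sum_j(pd_j / RR_j).

    case_proportions: share of all cases falling in each stratum (sums to 1).
    relative_risks:   relative risk of each stratum versus the reference (RR = 1).
    """
    if len(case_proportions) != len(relative_risks):
        raise ValueError("one relative risk is needed per stratum")
    if abs(sum(case_proportions) - 1.0) > 1e-9:
        raise ValueError("case proportions must sum to 1")
    return 1.0 - sum(pd / rr for pd, rr in zip(case_proportions, relative_risks))


# Hypothetical three-level risk factor; the first stratum is the reference.
example_pd = [0.5, 0.3, 0.2]
example_rr = [1.0, 1.5, 2.0]
example_par = bruzzi_par(example_pd, example_rr)  # 1 - (0.5 + 0.2 + 0.1) = 0.2
```

Only the distribution of cases across strata enters the calculation, which is why the method can be applied without the exposure distribution in non-cases.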
The PAR is useful in estimating how much of the disease burden in the population could be reduced if certain risk factors were eliminated 27. Individual risk factors may interact in their contributions to overall breast cancer risk. Consequently, PARs for individual risk factors often overlap and their sum may exceed 100% 28. The interpretation of PAR rests on several underlying assumptions. The PAR assumes that the risk factor is causal rather than merely associated with the disease. The PAR also assumes that the elimination of the risk factor does not affect the distribution of other risk factors, which is unlikely to always hold true 16. In addition, the PAR is sensitive to the reference category chosen and to the distribution of risk factors in the population, so caution is needed when comparing our results to other studies or different time periods 8,10,11. Where the reference category was not the category with the lowest risk of breast cancer, PAR was estimated by shifting only women in higher-risk categories to the reference category. For example, in the estimation of the PAR for number of live births, the reference level was 1-2 live births. The reported PAR therefore reflects nulliparous women (the highest risk category) compared to the reference level.
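One way to read "shifting only women in higher-risk categories to the reference category" is to sum the attributable fraction pd_j × (1 − 1/RR_j) over strata with RR_j > 1 and to leave lower-risk strata untouched. The sketch below follows that reading, which is an assumption rather than a documented detail of the analysis; the function name and the numbers are hypothetical.

```python
def par_shift_higher_risk_only(case_proportions, relative_risks):
    """PAR when only strata with RR > 1 are moved to the reference category.

    Strata with RR <= 1 contribute nothing, which is equivalent to applying
    1 - sum_j(pd_j / RR_j) with their relative risks set to 1.
    """
    return sum(
        pd * (1.0 - 1.0 / rr)
        for pd, rr in zip(case_proportions, relative_risks)
        if rr > 1.0
    )


# Hypothetical live-birth strata: nulliparous, 1-2 (reference), 3 or more.
pd_births = [0.10, 0.30, 0.60]
rr_births = [1.4, 1.0, 0.8]

par_restricted = par_shift_higher_risk_only(pd_births, rr_births)
# The unrestricted formula would also move the protective stratum to the
# reference level, which here yields a negative value.
par_unrestricted = 1.0 - sum(pd / rr for pd, rr in zip(pd_births, rr_births))
```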
The relative risks (RR_j, where j = 1, …, maximum number of strata of the variable of interest) used in the estimation of PAR were approximated by the odds ratios obtained from univariable and multivariable logistic regression models fitted to the actual dataset. In the multivariable models, age, BMI, age at menarche, and breast density were adjusted for as continuous variables, unless they were the risk factor for which PAR was being estimated, in addition to smoking status, ethnicity, family history, personal history of benign breast disease, number of live births, and HRT use. Missing values for continuous variables were replaced by the mean of the variable during adjustment. Bootstrapping with 2,000 iterations, using the estimated RR, was used to estimate the 95% confidence interval of PAR.
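A bootstrap confidence interval for PAR with fixed relative risks could look like the sketch below. It assumes that cases are resampled with replacement and that a percentile interval is reported; neither detail is stated in the text, and the function name, stratum labels, counts, and relative risks are hypothetical.

```python
import random


def bootstrap_par_ci(case_strata, relative_risks, n_iter=2000, seed=0):
    """Percentile 95% CI of PAR = 1 - sum_j(pd_j / RR_j), holding RR_j fixed.

    case_strata:    one stratum label per case.
    relative_risks: dict mapping stratum label to its relative risk.
    """
    rng = random.Random(seed)
    n = len(case_strata)
    pars = []
    for _ in range(n_iter):
        resample = rng.choices(case_strata, k=n)  # resample cases with replacement
        # sum_j(pd_j / RR_j) equals the mean of 1 / RR over the resampled cases.
        pars.append(1.0 - sum(1.0 / relative_risks[s] for s in resample) / n)
    pars.sort()
    lower = pars[int(0.025 * n_iter)]
    upper = pars[int(0.975 * n_iter) - 1]
    return lower, upper


# Hypothetical data: 474 cases spread over three strata of one risk factor.
cases = ["ref"] * 200 + ["mid"] * 150 + ["high"] * 124
rr = {"ref": 1.0, "mid": 1.5, "high": 2.0}
point_estimate = 1.0 - sum(1.0 / rr[s] for s in cases) / len(cases)
ci_lower, ci_upper = bootstrap_par_ci(cases, rr)
```

Because the relative risks are held fixed, the interval reflects only the sampling variability of the case distribution, not the uncertainty in the odds ratios.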
Pair-wise combinations of risk factors were studied. The reference category was taken to be the combination of the reference categories of both risk factors studied. For example, for the combination of BMI (reference level = 18.5-24.9) and family history (reference level = no), women with a BMI of 18.5-24.9 and without a family history formed the reference category. As in the analysis of a single risk factor, PAR was estimated by shifting only women in higher-risk categories to the reference category.
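For a pair of risk factors, the same calculation can be applied to joint strata, with the joint reference category having a relative risk of 1. The sketch below assumes that a relative risk is available for each combination of levels (for example from a model fitted on the joint categories) and that only combinations with a relative risk above 1 are shifted; the function name, labels, counts, and relative risks are hypothetical.

```python
from collections import Counter


def pairwise_par(case_levels, joint_rr):
    """PAR for two risk factors combined into joint strata.

    case_levels: one (level_of_factor_1, level_of_factor_2) tuple per case.
    joint_rr:    dict mapping each tuple to its relative risk versus the
                 joint reference category.
    """
    counts = Counter(case_levels)
    n = sum(counts.values())
    par = 0.0
    for combo, count in counts.items():
        rr = joint_rr[combo]
        if rr > 1.0:  # shift only higher-risk combinations to the reference
            par += (count / n) * (1.0 - 1.0 / rr)
    return par


# Hypothetical cases by BMI category and family history of breast cancer.
joint_cases = (
    [("18.5-24.9", "no")] * 50
    + [(">=25", "no")] * 40
    + [("18.5-24.9", "yes")] * 6
    + [(">=25", "yes")] * 4
)
joint_rr = {
    ("18.5-24.9", "no"): 1.0,  # joint reference category
    (">=25", "no"): 1.3,
    ("18.5-24.9", "yes"): 2.0,
    (">=25", "yes"): 2.6,
}
joint_par = pairwise_par(joint_cases, joint_rr)
```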
The age-specific absolute risk of developing breast cancer in each risk category was calculated under the assumption that the average age-specific breast cancer incidence over all risk categories agreed with the population breast cancer incidence. The details of the method have been previously described 29. Projection of the absolute risk distribution was based on the breast cancer incidence rates 30 and mortality rates in Singapore 29,31. A random selection of 50% of study participants was used to build the logistic model to categorize women with differing risks of breast cancer (<30th, 30-60th, 60-90th, ≥90th percentiles) based on non-modifiable risk factors (ethnicity, family history of breast cancer, history of benign breast disease, parity, age at menarche, number of live births and age at first live birth). The relative and absolute risks of developing breast cancer by percentiles of the non-modifiable risk factor distribution were calculated using the remaining 50% of study participants. Bootstrapping with 2,000 iterations was done to estimate the odds ratios used in the projection of absolute risks. Absolute risk was computed only for non-modifiable risk factors as their values do not change over a woman's lifetime. For comparison, we added BMI, smoking status and breast density to the risk model and estimated the absolute risk based on the new risk categorization of women. The discriminatory accuracy of non-modifiable risk factors in predicting breast cancer risk was evaluated using the area under the receiver operating characteristic curve (AUC).
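The projection of absolute risk from age-specific incidence and mortality rates can be illustrated with a simple discrete-time calculation that treats death from other causes as a competing risk. This is a generic sketch and not the method of reference 29; the function names and rates are hypothetical, and rescaling the relative risks to a population-weighted mean of 1 is only one simple way of making the average incidence over risk categories agree with the population incidence.

```python
import math


def calibrate_relative_risks(relative_risks, population_shares):
    """Rescale relative risks so that their population-weighted mean equals 1."""
    mean_rr = sum(rr * share for rr, share in zip(relative_risks, population_shares))
    return [rr / mean_rr for rr in relative_risks]


def absolute_risk(incidence, mortality, rr, start_age, years):
    """Probability of developing breast cancer within `years` of `start_age`.

    incidence, mortality: dicts mapping age to yearly rates (per woman-year).
    rr: relative risk of the risk category versus the population average.
    """
    survival = 1.0  # probability of being alive and cancer-free so far
    risk = 0.0
    for age in range(start_age, start_age + years):
        h_cancer = incidence[age] * rr
        h_total = h_cancer + mortality[age]
        if h_total == 0.0:
            continue
        p_any_event = 1.0 - math.exp(-h_total)
        risk += survival * p_any_event * (h_cancer / h_total)
        survival *= math.exp(-h_total)
    return risk


# Hypothetical flat rates for ages 50-59 and three risk categories.
incidence = {age: 0.002 for age in range(50, 60)}
mortality = {age: 0.005 for age in range(50, 60)}
calibrated = calibrate_relative_risks([0.5, 1.0, 2.0], [0.3, 0.6, 0.1])
ten_year_average = absolute_risk(incidence, mortality, 1.0, 50, 10)
ten_year_by_category = [absolute_risk(incidence, mortality, rr, 50, 10) for rr in calibrated]
```

The age at which a risk category reaches a screening threshold can then be found by evaluating the ten-year risk at successive starting ages.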

Results
A total of 28,130 women were included in this study, of whom 474 (1.7%) developed breast cancer. The study population was predominantly ethnic Chinese (84.2%) with smaller percentages of Malays (5.6%), ethnic Indians (5.0%) and other races (5.2%) (Table 1). The majority of women had at least one biological child (92.7%) and had previously breastfed (63.8%). The majority also reported no contraceptive use (61.8%) and no HRT use (86.7%). Nine in ten women (89.6%) were post-menopausal (the description of post-menopausal women by breast cancer occurrence is presented in Table 1). Further details describing the study population may be found in Supplementary Tables 1-3. Non-Malay ethnicity, family history of breast cancer, personal history of benign breast disease, younger age at menarche, fewer live births, and higher breast density were associated with increased risks of developing breast cancer (Table 2), and these associations persisted after adjustment. BMI was not associated with breast cancer risk in the univariate model, but after adjusting for other variables, higher BMI was found to be associated with increased breast cancer risk. Results are discussed separately under sub-sections on modifiable, non-modifiable, reproductive and hormonal risk factors, and breast density below.
Among risk factors significantly associated with breast cancer, the most prevalent risk factor was ethnicity, with 94.4% non-Malays, followed by high breast density (65.2%), BMI ≥25 kg/m2 (43.4%), and age at menarche ≤13 (35.1%). Other risk factors were of low prevalence; among all women 7.2% were nulliparous, 5.3% had a personal history of benign breast disease and 2.6% had a family history of breast cancer.

PARs of reproductive and hormonal factors.
A smaller proportion of breast cancer cases appeared to be attributable to reproductive and hormonal factors (Table 2). If all women had reached menarche at age 14 or older, 9.2% (95% CI: 8.2-9.8) of breast cancers could be avoided; if all women had at least one child, 4.2% (95% CI: 3.3-5.0) could be avoided.
PARs of breast density. Up to 48.6% (95% CI: 46.4-50.3) of all breast cancers could potentially be prevented if all women had breast density ≤12.29% (Table 2). The lowest PAR for combinations of breast density with non-modifiable risk or reproductive and hormonal factors was 32.0% (number of live births, 95% CI: 29.8-33.7) in all women. The majority of our findings were not appreciably different in a subset of post-menopausal women (Tables 2 and 3). The PAR estimates in Chinese women were similar to those of all women (Supplementary Tables 4 and 5).
Absolute risks based on risk categories of non-modifiable risk factors. The median discriminatory accuracy, measured by AUC, of non-modifiable risk factors in predicting breast cancer risk was 62.9% (interquartile range: 61.7-64.1). Women with the highest risk of breast cancer (>90th percentile) as categorized by non-modifiable risk factors were 1.98 (95% CI: 1.73-2.27) times more likely to develop breast cancer than women with average risk (30-60th percentile) (Table 4). The lifetime risk of developing breast cancer for women in the bottom 30% and top 10% of the risk distribution is 3.1% and 10.2%, respectively (Fig. 2A). The absolute risk of developing breast cancer in the next 10 years for women at age 50, the widely recommended age to start biennial mammography screening, is 2.3% 33. In our cohort the absolute risk at age 50 was 1.8% for Chinese, 0.8% for Malays, and 2.1% for Indians. More than twice as many breast cancer cases were found in the highest risk category (>90th percentile) as in the lowest risk category (<30th percentile). Women in the bottom 60% of the risk distribution will never reach the risk threshold for screening, while women in the highest risk group will reach the risk level recommended for mammography screening when they are ~41 years old (Fig. 2B).
The inclusion of modifiable risk factors (BMI and smoking) and breast density resulted in a slight improvement in the median discriminatory accuracy (AUC [interquartile range]: 65.6 [64.5-66.8]). The risk model was improved in its ability to differentiate women in the 60-100th percentile from women in the 0-60th percentile (Supplementary Fig. 1). The highest risk group (90-100th percentile) would reach the ten-year risk of 2.3% at age ~38 years, ~3 years earlier than the age estimated using only non-modifiable risk factors (Supplementary Table 6).
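The AUC used as the measure of discriminatory accuracy equals the probability that a randomly chosen case receives a higher predicted risk than a randomly chosen non-case, with ties counted as one half. A minimal sketch of that rank-based definition follows; the function name and scores are hypothetical, and this is not necessarily the routine used in the analysis.

```python
def auc(case_scores, noncase_scores):
    """Area under the ROC curve via pairwise comparison of predicted risks."""
    wins = 0.0
    for case in case_scores:
        for noncase in noncase_scores:
            if case > noncase:
                wins += 1.0
            elif case == noncase:
                wins += 0.5  # ties count as half a concordant pair
    return wins / (len(case_scores) * len(noncase_scores))


# Hypothetical predicted risks from a fitted model.
example_auc = auc([0.8, 0.6, 0.4], [0.7, 0.5, 0.3, 0.2])  # 9 of 12 pairs concordant
```

The pairwise loop is quadratic in the number of participants; for a cohort of this size a sort-based implementation would be preferable.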

Discussion
Up to 16.2% of all breast cancer cases in Singapore could potentially be avoided if all women with high BMI in the population were to attain a BMI of <25 kg/m2. High breast density, a strong risk factor for breast cancer, and ethnicity each had a PAR of close to 50% for breast cancer in Singapore. Emerging evidence has shown that prophylactic treatment with drugs such as Tamoxifen can induce a reduction in breast density, which is in turn linked to decreased breast cancer risk [34][35][36]. Hence, we also examined breast density as a potentially modifiable risk factor. Other studied breast cancer risk factors (smoking, family history of breast cancer, history of benign breast disease, age at menarche, number of live births, age at first live birth, ever breastfed, contraceptive use, and HRT use) were individually associated with a breast cancer PAR of no more than 10%. Sixty percent of the women with breast cancer risk categorized by non-modifiable risk factors will never reach the risk threshold recommended for mammography screening.
The PAR (17.7%) of BMI in our study falls in the range of 2.4% 13 to 22.8% 8 observed in post-menopausal Western women. Excess adipose tissue and increased aromatase activity in obese post-menopausal women may increase their levels of circulating endogenous estrogen, which in turn increases breast cancer risk [37][38][39][40][41][42] . This finding has potential repercussions for public health as BMI is one of the risk factors that are easily measurable and modifiable.
Concordant with previous studies, we found that Malay women were less likely to develop breast cancer 43,44 . Possible reasons reported include the tendency of Malay women to have more children, have their first child at a younger age and breastfeed over longer periods 44 . However, after adjusting for these factors, PAR of ethnicity for breast cancer remained large, 49.3% in all women and 52.4% in post-menopausal women. Additional socio-cultural factors, such as dietary preferences influenced by their religious beliefs may also play a role 43 .
Unlike other risk factors, the high PAR of ethnicity is mainly due to the much larger proportion of Chinese than Malays (i.e. ~80% of the population would have a reduced risk of breast cancer if the risk of breast cancer in Chinese were reduced to the level of Malays). Prevalence likewise explains the low PAR of family history: the risk of breast cancer is high in women with a family history of breast cancer (OR = 2.31), but few women reported a family history, resulting in fewer potentially avertable breast cancers when family history is considered as a risk factor.
Less than 10% of breast cancers in Singapore were attributable to young age at menarche (age ≤13) after we accounted for other factors like breast density and ethnicity. This is consistent with findings by Tamimi et al. 9 and Barnes et al. 13, who reported PARs of 8.6% and 7.7% respectively. Sprague et al. reported a higher PAR estimate of 18.8%, using women who were at least 15 years old at menarche as the reference group 11. The association between early menarche and increased breast cancer risk can be attributed to the earlier exposure to, and higher levels of, estrogen experienced by women who had early menarche 45. Similar to the findings by Li et al. 20 and Park et al. 18, PARs for reproductive factors were small.
Our study is in agreement with results from Western populations that, among the well-known risk factors of breast cancer, breast cancer is highly attributable to high breast density 8. The association between breast density and breast cancer is well-established and has been confirmed by many studies since it was first described by Wolfe in 1976 46,47. Women with dense breasts were reported to have a four to six times greater risk of breast cancer compared to those without any visible density, with 26-28% of all breast cancer cases being attributable to having breast densities greater than 50% 48,49. While reducing breast density holds higher potential for decreasing the risk of breast cancer than lowering BMI, the use of anti-estrogen drugs such as Tamoxifen to reduce breast density is not common practice [34][35][36]. Obesity is related to a multitude of other diseases such as type 2 diabetes and coronary heart disease 50. In view of a high prevalence of sedentary behavior and general lack of physical activity, much can be achieved to improve overall health and decrease breast cancer incidence by lowering BMI 51.
Risk factors such as ethnicity, family history of breast cancer, personal history of benign breast disease, age at menarche, and number of live births are by nature non-modifiable. Information on non-modifiable risk factors can help reduce breast cancer incidence by stratifying the population at risk of developing breast cancer for targeted screening. The current nationwide screening strategy, which started in 2002, recommends that women aged 40-49 years go for routine mammography screening every year, and women aged 50 years and above every two years 17. However, only 66% of the main target group of women aged 50 to 69 have ever had a mammogram, and half of them do not come back for regular screening at two-year intervals, negating the benefit of mammography (Health Promotion Board, Singapore). Conveying how easily assessable non-modifiable risk factors can affect the risk of breast cancer may persuade high-risk women to go for regular screening.

www.nature.com/scientificreports

We acknowledge that our study has some limitations. Our screening population only included women who were at least 50 years of age, of whom ~90% were post-menopausal. As such, our results may not be generalizable to pre-menopausal women and women of younger age. Due to the small numbers of Malay and Indian participants, we were not able to estimate PAR for each risk factor by ethnicity. However, the impact of the lack of ethnicity-specific estimates may not be large in Singapore (74.3% of the population is Chinese and 13.4% Malay 21). Participation in opportunistic screening during the study period between 1992-1994 was low as mammography screening was expensive 52. Women attending screening may also be more health conscious, which can potentially lead to underestimation of PAR. The sample size was also limited; thus, we could only examine broader absolute risk categories. In the consideration of breast density as a modifiable risk factor, the drugs used to induce breast density reduction may also reduce breast cancer risk through other mechanisms. In this case, the PAR associated with breast density may be overestimated 34. In addition, the associations between some risk factors and breast cancer may differ according to the tumor subtype 13,53. For example, Millikan et al. 53 found that basal-like cases exhibited opposite associations to those observed for luminal A for risk factors including parity, age at first pregnancy and breastfeeding. Further stratification by tumor subtype may more clearly reveal the relative importance of risk factors for each subtype.

Table 3. Population attributable risk (PAR) for combinations of risk factors.
Where the risk factors (as categorical variables) were not studied in combination, the following risk factors were adjusted for: breast density (as a continuous variable), body mass index (as a continuous variable), ethnicity, age at recruitment (as a continuous variable), family history of breast cancer, age at menarche (as a continuous variable), age at first live birth, and hormone replacement therapy use. a Categories with undefined odds ratios were not used in the calculation of pd_j/RR_j; however, they are included in obtaining pd_j of other categories. b Excludes women who are nulliparous.

Table 4. Percentiles and odds ratios used in Fig. 2. a Logistic model (built using the training dataset) of the association of breast cancer and non-modifiable risk factors (ethnicity, family history of breast cancer, history of benign breast disease, age at menarche, and age at first live birth). Cut-off of predicted risk is obtained using the testing dataset. b Odds ratios are obtained using the bootstrap method (2000 iterations) on the testing dataset. CI: Confidence interval, OR: Odds ratio.

Figure 2.
Cumulative lifetime and ten-year absolute risks for developing breast cancer for women in Singapore. Presented by percentiles of risk from non-modifiable risk factors (ethnicity, family history of breast cancer, history of benign breast disease, age at menarche, number of live births, and age at first live birth). Absolute risk was computed only for non-modifiable risk factors as the values do not change over a woman's lifetime. The intersection of the different risk curves with the red dashed line in (B) indicates the age at which women in different risk categories would reach the same ten-year absolute risk (2.3%) of women who start screening at age 50 according to Surveillance, Epidemiology, and End Results (SEER) statistics 33 .