Introduction

Many genes that increase the risk for breast cancer have been identified, including BRCA1,1 BRCA2,2 CHEK2,3 ATM,4 TP53,5 and PTEN.6 The genes most commonly associated with hereditary breast cancer are BRCA1 and BRCA2. Therefore, many risk assessment models have been developed to predict the probability of carrying a BRCA1 or BRCA2 mutation. These models include empirical models (Myriad prevalence tables and Manchester scoring system),7, 8 genetic models (BRCAPRO and BOADICEA)9, 10 and a logistic regression model (LAMBDA).11 However, the performance of these models varies among ethnic groups. For instance, the widely used BRCAPRO and Myriad II models underestimate the proportion of BRCA1/2 mutation carriers in Asians.12 We have shown that these models significantly underestimate the number of mutation carriers in Korean breast cancer patients; indeed, their performance is worse in Korean populations than in other ethnicities.13, 14 Therefore, a new risk assessment model based on data from Korean patients is needed.

In 2007, a large prospective nationwide database based on the Korean Hereditary Breast Cancer (KOHBRA) study was established to determine the prevalence of BRCA1/2 mutations among breast cancer patients at risk for hereditary breast and ovarian cancer (HBOC).15 According to the KOHBRA study, the overall prevalence of BRCA mutations among patients with breast cancer and with a family history of breast or ovarian cancer was 21.7% (BRCA1 9.3% and BRCA2 12.4%), and deleterious mutations were more frequently observed in patients with stronger family histories (higher number of relatives with breast or ovarian cancer).16 Among the patients without a family history of breast or ovarian cancer, the mutation prevalence was 10.0% for patients with early onset (<35 years), 17.7% for patients with bilateral breast cancer, 5.9% for male breast cancer patients and 50.0% for patients with breast and ovarian cancer.17 The mutation prevalence in most high-risk subgroups of Korean breast cancer patients was much higher than the 10% carrier probability threshold that is commonly used for selecting families for BRCA genetic testing. However, the predictive strength of each risk factor for BRCA mutations has not been well established in Korea. The KOHBRA study is ongoing to establish a BRCA carrier cohort and provide accurate data on the prevalence and penetrance of the BRCA1/2 mutations. The large volumes of data from the KOHBRA study have made it possible to develop a BRCA risk prediction model for the Korean population. Our goal was to identify the factors associated with BRCA mutations and to develop a Korean BRCA mutation risk calculator using a logistic regression model. The performance of the new model was evaluated against the KOHBRA data using a split-sample validation method.

Materials and methods

Between May 2007 and August 2012, 2071 female breast cancer patients (probands) at risk for HBOC underwent genetic counseling and testing for BRCA1 and BRCA2 gene mutations through the KOHBRA study. The eligibility criteria for the KOHBRA study were as follows: (1) patients with breast cancer and a family history of breast or ovarian cancer (familial breast cancer patients); (2) patients with breast cancer without a family history of breast or ovarian cancer (non-familial breast cancer patients), including subjects aged ≤40 years at diagnosis, those with bilateral breast cancer, male patients or those diagnosed with another primary malignancy related to BRCA mutations; and (3) family members of BRCA1/2 mutation carriers.15

The prospectively collected KOHBRA database contains genetic, clinical and demographic data, including personal cancer history (age at diagnosis, bilateral breast cancer, coexistence of ovarian cancer and other BRCA1/2-related cancers); number of relatives with breast or ovarian cancer among first-, second- and third-degree relatives; and histological characteristics (histological type (invasive breast cancer vs ductal carcinoma in situ), estrogen receptor (ER), progesterone receptor (PR) and human epidermal growth factor receptor 2 (HER2) status). Personal and family histories of cancer were primarily obtained from the epidemiological questionnaire, and a pedigree including at least three generations was obtained through counseling by the KOHBRA research nurses. The histological type and the ER, PR and HER2 status determinations were based on pathology reports from each institution. In general, an immunohistochemical score of >3 (at least 10% positive cells) was used to define ER and PR positivity, and an immunohistochemical score of 3+ was used to define HER2 positivity. Triple-negative breast cancer (TNBC) was immunohistochemically defined as ER negative, PR negative and lacking overexpression of HER2. More information on data collection in the KOHBRA study can be found in the previously published protocol and interim report.15

For the analysis, 2071 female probands were included as study subjects, all of whom had a personal history of breast cancer. We defined the proband as the first family member recruited for BRCA1/2 genetic counseling and testing, and familial breast cancer patients as persons who have a family history of breast or ovarian cancer within third-degree relatives. In total, 1669 female breast cancer patients enrolled between May 2007 and December 2010 were used for the model construction (850 familial breast cancer patients and 819 non-familial breast cancer patients); 402 female breast cancer patients enrolled between January 2011 and August 2012 were used for validation. To build the model, we compared the demographics and clinical characteristics of the carriers and non-carriers using the chi-square or Fisher’s exact test for categorical variables. Factors with P-values <0.05 were included in a multivariate logistic regression model constructed using the stepwise selection procedure to identify predictors of the BRCA mutation status. The standard logistic regression formula is as follows:

$$P = \frac{1}{1 + e^{-(\beta_0 + \beta_1\chi_1 + \beta_2\chi_2 + \cdots + \beta_n\chi_n)}}$$

where ‘P’ is the estimated probability of carrying the BRCA gene mutation; ‘β’ is the influence coefficient; ‘β0’ is a constant; and ‘χ’ is the influence factor. Separate models were developed for the familial and non-familial breast cancer patient groups. To assess the performance of the model, we compared the observed and predicted numbers of BRCA1/2 mutation carriers using the chi-square or Fisher’s exact test. We evaluated the ability of our model to discriminate between mutation carriers and non-carriers by means of receiver operating characteristic (ROC) curves and the area under the ROC curve. All data were analyzed using SPSS (version 21, IBM, NY, USA); a P-value <0.05 was considered to be statistically significant. This study was approved by the Seoul National University Bundang Hospital Institutional Review Board (IRB No. B-1007/105-009).
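As a minimal sketch of how the logistic formula turns a set of risk factors into a carrier probability, the following Python function evaluates it for arbitrary coefficients. The function name, the intercept and the coefficient values are hypothetical and chosen only for illustration; they are not the fitted KOHCal coefficients, which are given in Tables 2 and 4.

```python
import math


def carrier_probability(intercept, coefficients, factors):
    """Logistic model: P = 1 / (1 + exp(-(b0 + sum(b_i * x_i))))."""
    if len(coefficients) != len(factors):
        raise ValueError("need exactly one coefficient per factor")
    linear = intercept + sum(b * x for b, x in zip(coefficients, factors))
    return 1.0 / (1.0 + math.exp(-linear))


# Hypothetical values for illustration only; NOT the fitted KOHCal coefficients.
example_intercept = -2.0
example_coefficients = [0.8, 1.1]  # e.g. bilateral breast cancer, TNBC
example_factors = [1, 0]           # bilateral: yes, TNBC: no

p = carrier_probability(example_intercept, example_coefficients, example_factors)
print(f"predicted carrier probability: {p:.3f}")
```

Each binary factor shifts the linear predictor by its coefficient, so a positive coefficient raises the predicted probability and a negative one lowers it.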

Results

Development of the Korean BRCA risk calculator

In the model set, the mean ages of breast cancer diagnosis in the patients with familial and non-familial breast cancer were 43.5 and 36.9 years, respectively. Of the 1669 patients in the model set, 850 were familial breast cancer patients with a family history of breast or ovarian cancer; 819 were high-risk patients for HBOC without a family history. The overall prevalence of deleterious mutations was 15.8% (264/1669), with 106 (40.2%) in BRCA1 and 156 (59.1%) in BRCA2; 2 (0.8%) subjects carried deleterious mutations in both genes. In total, 199 (23.4%) and 65 (8.0%) mutation carriers were identified in 850 familial and 819 non-familial breast cancer patients, respectively.

Among the familial breast cancer patients, the univariate analysis revealed associations between the BRCA gene mutations and young age at first breast cancer diagnosis (P<0.001), bilateral breast cancer (P<0.001), invasiveness of histological type (P=0.047), TNBC (P<0.001) and a high number of relatives with breast cancer (P<0.001) and ovarian cancer (P<0.001). A personal history of ovarian cancer (P=0.334) and other BRCA-related cancers (P=0.532) were not significantly associated with the BRCA gene mutation (Table 1). The clinical and pathological factors significantly associated with BRCA1/2 mutations were used in the multivariate logistic regression model. With the exception of the histological type, five factors remained as significant predictors of a BRCA gene mutation (Table 2). The probability formula for predicting BRCA mutations in familial breast cancer patients using the logistic regression model is as follows:

Table 1 Univariate analysis of BRCA1/2 mutations and clinical characteristics in familial breast cancer patients in the model set
Table 2 Predictors for BRCA1/2 mutations in familial breast cancer patients in the model set

For familial breast cancer patients, the probability of carrying a BRCA mutation decreased with increasing age at diagnosis from <36 years to 36–45 years to ≥46 years. We selected this age grouping because it had a better C-statistic than the other age groupings (<35 and ≥35 years or <50 and ≥50 years).

Among the non-familial breast cancer patients, an age at breast cancer diagnosis <35 (P=0.021), bilateral breast cancer (P<0.001), presence of both breast and ovarian cancer (P=0.008) and TNBC (P=0.003) were related to the BRCA gene mutations (Table 3). When these factors were included in the multivariate analysis, all were found to be independent predictors of BRCA gene mutations (Table 4). The mutation probability formula in non-familial breast cancer patients is as follows:

Table 3 Univariate analysis of BRCA1/2 mutations and clinical characteristics in non-familial breast cancer patients in the model set
Table 4 Predictors for BRCA1/2 mutations in non-familial breast cancer patients in the model set

Validation of the Korean BRCA risk calculator

The prevalence of BRCA1/2 mutations in the validation set was 16.7% (67/402), which is consistent with that in the model set (P=0.677). Twenty-six (38.8%) patients carried a BRCA1 mutation and 41 (61.2%) patients carried a BRCA2 mutation. The observed proportions of BRCA mutations among the familial and non-familial breast cancer patients were 21.3% (39/183) and 12.8% (28/219), respectively; these observations were similar to the model predictions (25.4% and 8.5%, respectively). Table 5 shows the observed and the expected proportions of BRCA1 or BRCA2 mutations in the validation set according to the predicted carrier probabilities. There were no significant differences between the observed and expected carrier proportions within each risk category. The area under the ROC curve was 0.756 (95% confidence interval (95% CI), 0.669–0.842) for the familial breast cancer model and 0.620 (95% CI, 0.498–0.741) for the non-familial breast cancer model (Figure 1). Both values were significantly different from 0.5. Table 6 shows the performance of our models for familial and non-familial breast cancer at a carrier probability of 10%. Our model for familial breast cancer exhibited a high sensitivity (94.9%) at this cutoff value.
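The comparison of mutation prevalence between the model set (264/1669) and the validation set (67/402) can be checked with a Pearson chi-square test on the 2×2 table of carriers and non-carriers. The sketch below uses only the Python standard library and assumes an uncorrected test (no continuity correction); the text does not state which variant was run in SPSS, and the function name is ours.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    rows = (a + b, c + d)
    cols = (a + c, b + d)
    observed = ((a, b), (c, d))
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = rows[i] * cols[j] / n
            stat += (observed[i][j] - expected) ** 2 / expected
    # With 1 degree of freedom the upper-tail probability is erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Carriers and non-carriers: model set 264 of 1669, validation set 67 of 402.
stat, p_value = chi_square_2x2(264, 1669 - 264, 67, 402 - 67)
print(f"chi-square = {stat:.3f}, P = {p_value:.3f}")
```

Under this assumption the P-value comes out at about 0.677, in agreement with the value reported above.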

Table 5 Observed and predicted mutation frequencies in the KOHCal validation set, stratified by predicted carrier probability
Figure 1 Receiver operating characteristic (ROC) curves for familial (a) and non-familial (b) breast cancer in the validation set. The area under the ROC curve (AUC) was 0.756 (P<0.001) for familial breast cancer patients and 0.620 (P=0.041) for non-familial breast cancer patients.
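The AUC can be read as the probability that a randomly chosen mutation carrier receives a higher predicted probability than a randomly chosen non-carrier. The following sketch computes it by direct pairwise comparison (equivalent to the Mann-Whitney statistic); the function name and the toy data are hypothetical and are not KOHBRA records.

```python
def roc_auc(carrier_flags, predicted_probabilities):
    """AUC as the chance that a random carrier outranks a random non-carrier."""
    pairs = list(zip(carrier_flags, predicted_probabilities))
    carriers = [p for y, p in pairs if y]
    non_carriers = [p for y, p in pairs if not y]
    if not carriers or not non_carriers:
        raise ValueError("need at least one carrier and one non-carrier")
    wins = 0.0
    for pc in carriers:
        for pn in non_carriers:
            if pc > pn:
                wins += 1.0
            elif pc == pn:
                wins += 0.5  # ties count as half a win
    return wins / (len(carriers) * len(non_carriers))


# Made-up toy data for illustration only.
toy_flags = [1, 0, 1, 0, 0, 1]
toy_probs = [0.40, 0.05, 0.15, 0.20, 0.08, 0.30]
print(f"AUC = {roc_auc(toy_flags, toy_probs):.3f}")
```

An AUC of 0.5 corresponds to no discrimination, which is why the reported values are tested against 0.5.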

Table 6 KOHCal performance with carrier probability of 10% in familial and non-familial breast cancer
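Performance at a fixed carrier probability can be sketched as follows: a patient counts as test-positive when the predicted probability is at least the cutoff, and sensitivity and specificity follow from the resulting 2×2 counts. The function name and the toy data are hypothetical, and treating "at least 10%" as the positive rule is our assumption.

```python
def performance_at_cutoff(carrier_flags, predicted_probabilities, cutoff=0.10):
    """Sensitivity and specificity when 'probability >= cutoff' means test-positive."""
    tp = fp = tn = fn = 0
    for is_carrier, prob in zip(carrier_flags, predicted_probabilities):
        flagged = prob >= cutoff
        if is_carrier and flagged:
            tp += 1
        elif is_carrier:
            fn += 1
        elif flagged:
            fp += 1
        else:
            tn += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Made-up toy data for illustration only.
toy_flags = [1, 1, 1, 0, 0, 0, 0]
toy_probs = [0.35, 0.12, 0.06, 0.22, 0.09, 0.04, 0.03]
sensitivity, specificity = performance_at_cutoff(toy_flags, toy_probs)
print(f"sensitivity = {sensitivity:.3f}, specificity = {specificity:.3f}")
```

If the denominators are the 39 familial and 28 non-familial carriers observed in the validation set, the reported sensitivities of 94.9% and 53.6% are consistent with 37 of 39 and 15 of 28 carriers exceeding the cutoff, respectively.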

Discussion

We built our model from the largest database of prospective, multi-institutional BRCA1/2 mutation data in Korea. The prevalence of BRCA1/2 mutations was 15.8% among 1669 patients, with 23.4% of the familial and 8.0% of the non-familial breast cancer patients carrying deleterious BRCA1/2 mutations. Younger age at breast cancer diagnosis, bilateral breast cancer, TNBC and a high number of relatives with breast or ovarian cancer were predictors of a BRCA gene mutation in familial breast cancer patients. An age at diagnosis <35 years, bilateral breast cancer, presence of both breast and ovarian cancer and TNBC were independent predictors of a BRCA gene mutation in non-familial breast cancer patients. These factors were applied to a logistic regression model to develop a Korean BRCA risk calculator, which accurately predicted the BRCA mutation rates for each risk category in the validation set.

Several related factors increase the probability of BRCA mutations, including early-onset breast cancer, bilateral breast cancer, history of both breast and ovarian cancer, male breast cancer and familial clustering of breast or ovarian cancer.7, 18, 19 Various prediction models have been developed based on these factors to identify patients with a high risk of HBOC. The Myriad and BRCAPRO models are the models most commonly used worldwide for the convenient selection of candidates for BRCA1/2 testing. The Myriad model initially involved logistic regression analyses, which identified the following predictive factors of BRCA1/2 mutations: breast cancer before 50 years of age and ovarian cancer in the family.7 BRCAPRO uses Bayes’ theorem to calculate the probability of carrying a BRCA mutation. The model is based on published BRCA1/2 mutation frequencies, information about the first- and second-degree relatives, individual cancer status, cancer penetrance, bilateral breast cancer and male breast cancer.9 The approach in our model followed that of the Myriad I model, which used logistic regression to predict BRCA1 mutations. Our model incorporated personal and familial breast or ovarian cancer histories, as well as the histological characteristics of each breast tumor and TNBC. Previous studies have found that the ER, PR and HER2 negativities of breast cancer are significantly linked to BRCA1 mutations.20, 21, 22, 23 The incorporation of breast tumor markers, such as ER, PR and HER2, has been shown to improve the performance of BRCAPRO and BOADICEA.24, 25, 26 We confirmed that most BRCA1 breast cancers are TNBC (data not shown). However, we did not derive separate formulas for the BRCA1 and BRCA2 risks, because our goal was to develop a simple model that gives a composite probability for selecting candidates for BRCA1 or BRCA2 genetic testing in Korean populations.

Familial clustering of breast and ovarian cancer is a strong predictor of a BRCA mutation.27 Family studies have documented the relationship between BRCA gene mutation risk and both the age of onset of cancer and the number of these cancers among first- and second-degree relatives, and the majority of the clinical guidelines for HBOC recommend genetic risk assessment in cases of two or more individuals with breast and/or ovarian cancer within second-degree relatives. Previous results of the KOHBRA study on the prevalence of BRCA mutations in familial breast cancer showed that the prevalence exceeded the 10% cutoff even among breast cancer patients with a weak family history (17.5% for patients with only one relative with breast cancer and 14.0% for patients whose closest relatives with breast cancer were third-degree relatives).16 Therefore, our model for familial breast cancer was developed based on extended family data, including information on third-degree relatives. The National Comprehensive Cancer Network Clinical Practice Guidelines in Oncology for Genetic/Familial High-Risk Assessment: Breast and Ovarian also suggest that close relatives for risk assessment include first-, second- and third-degree relatives. For the Breast and Ovarian Analysis of Disease Incidence and Carrier Estimation Algorithm (BOADICEA), a model of genetic susceptibility for breast and ovarian cancer developed by Antoniou et al.,10 the inclusion of extended family data slightly improved the accuracy of the risk prediction of a BRCA mutation compared with the risk prediction obtained using limited information on second-degree relatives.28

Our findings also showed that a higher number of relatives with breast cancer is related to a higher risk of carrying a BRCA mutation, and the age of breast cancer of the proband is an important predictor of a BRCA1/2 mutation when the patient has a weak family history of breast cancer (presence of one relative with breast cancer in the family). In our study, when patients >45 years with breast cancer had only one relative with breast cancer, the predicted prevalence of a BRCA1/2 mutation did not satisfy the 10% probability requirement. In the other age groups (<36 and 36–45 years), patients with one relative with breast cancer had a BRCA1/2 mutation probability of 19.3% and 13.9%, respectively.

Most prediction models include a family history of breast or ovarian cancer, making it difficult to score the risk of a BRCA mutation for a single case of breast cancer in a family. Therefore, we developed a separate risk calculation for isolated breast cancer patients without a family history of breast or ovarian cancer. Our study suggests that BRCA1/2 genetic testing is not justified in patients with early-onset breast cancer between the ages of 35 and 40 years who have no other risk factors (the predicted mutation rate is 2.7%). The predicted prevalence in patients with breast cancer aged <35 years without other risk factors (7.4%) and in patients with TNBC aged ≥35 years (5.9%) did not satisfy the 10% probability requirement. However, patients with TNBC aged <35 years had a BRCA1/2 mutation probability of 15.1%, which satisfied the 10% probability requirement. Previous studies suggest that testing is warranted in isolated breast cancer if the tumor is triple-negative and the patient is <35 years of age at diagnosis.29, 30 Therefore, BRCA1/2 genetic testing should be recommended for young women with an isolated case of breast cancer based on the age of onset and the pathological features (ER, PR and HER2 negativities) of the tumor.
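The decision rule in this paragraph can be made explicit by comparing each predicted probability quoted above with the 10% testing threshold. The sketch below does only that; the profile labels are ours, and reading the TNBC group with 5.9% as "35 years or older" is an assumption.

```python
TESTING_THRESHOLD = 0.10  # carrier probability commonly used to select testing candidates

# Predicted probabilities for non-familial patients as quoted in the text.
predicted = {
    "age 35-40, no other risk factor": 0.027,
    "age <35, no other risk factor": 0.074,
    "TNBC, age >=35 (assumed reading)": 0.059,
    "TNBC, age <35": 0.151,
}

meets_threshold = {
    profile: probability >= TESTING_THRESHOLD
    for profile, probability in predicted.items()
}

for profile, probability in predicted.items():
    verdict = "meets" if meets_threshold[profile] else "below"
    print(f"{profile}: {probability:.1%} ({verdict} 10% threshold)")
```

Only the combination of TNBC and diagnosis before 35 years crosses the threshold, which matches the recommendation that age of onset and tumor pathology be considered together.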

Personal history of ovarian cancer is one of the major indicators for BRCA genetic screening. Approximately 10% of unselected incident cases of ovarian cancer have mutations in the BRCA1 and BRCA2 genes.31 In our study, a personal history of ovarian cancer was a predictor of a BRCA1/2 mutation in only the non-familial breast cancer patients. The univariate analysis revealed that a personal history of ovarian cancer in familial breast cancer patients tended to increase the mutation risk but was not statistically significant (odds ratio 2.193, 95% CI 0.364–13.217). This lack of significance was likely observed because there were only five subjects with a personal history of ovarian cancer, which is too small to observe statistical significance in patients who had a strong predictor, such as a family history of breast cancer and/or ovarian cancer (n=850).
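The width of the reported confidence interval illustrates how little information five subjects provide. Assuming the interval is a Wald interval, which is symmetric around the logarithm of the odds ratio, the standard error of the log odds ratio can be recovered from the bounds, as in this sketch (variable names are ours).

```python
import math

# Values reported in the text for a personal history of ovarian cancer
# among familial breast cancer patients.
odds_ratio, ci_low, ci_high = 2.193, 0.364, 13.217
z = 1.96  # normal quantile for a two-sided 95% interval

# Assumption: Wald interval, log(OR) +/- z * SE, so the half-width gives SE.
se_log_or = (math.log(ci_high) - math.log(ci_low)) / (2 * z)

# The geometric mean of the bounds should recover the point estimate.
midpoint = math.exp((math.log(ci_low) + math.log(ci_high)) / 2)

# The association is not significant at the 5% level if the interval contains 1.
includes_one = ci_low <= 1.0 <= ci_high

print(f"SE of log(OR) = {se_log_or:.3f}")
print(f"geometric midpoint = {midpoint:.3f} (reported OR {odds_ratio})")
print(f"interval includes 1: {includes_one}")
```

The recovered midpoint agrees with the reported odds ratio, and a standard error of roughly 0.9 on the log scale makes the interval span almost two orders of magnitude, so the absence of statistical significance is unsurprising.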

Prediction models generally account for the prevalence of BRCA1/2 mutations in specific ethnic populations; therefore, their validity must be verified before they can be applied to different populations. Previously, we evaluated the performance of the Myriad II and BRCAPRO models for Korean breast cancer patients who received BRCA genetic testing.13 The overall prevalence of BRCA1/2 mutations was significantly higher than that predicted by these models. We also found that the models performed poorly for patients with only one relative with breast cancer (probands’ age at breast cancer diagnosis >50 years) and patients with non-familial early-onset breast cancer or bilateral breast cancer. At a 10% threshold, the sensitivities of BRCAPRO and Myriad II among Korean women were only 47.8% and 50.0%, respectively, which are lower than their sensitivities in other races (70.7–88.2%).32, 33 This finding is attributable to differences in penetrance: multiple factors, such as modifying genes, environmental factors, birth cohorts and race, influence the penetrance of BRCA1 and BRCA2 mutations. Our model for familial breast cancer patients successfully predicted the number of mutation carriers and had a high sensitivity (94.9%). The non-familial model was able to predict the observed mutation proportion according to the predicted probabilities, but it had a relatively low sensitivity (53.6%) at a carrier probability of 10%. Evaluating the performance of a prediction model requires a sufficient number of test subjects in each risk category. The low sensitivity therefore appears to reflect the small and uneven distribution of subjects across the probability levels, as the number of subjects with a predicted probability of at least 10% was half the number of subjects with a predicted probability of <10%.
Because non-familial breast cancer patients with high mutation probabilities are in fact likely to be rare, we expect that a high sensitivity would be difficult to achieve with the non-familial model. Although our non-familial model did not show a high sensitivity (at a 10% cutoff value) owing to the limited number of test subjects in the high-risk groups, the proportion of mutation carriers in the different ranges of probabilities corresponded well to the expected mutation rates in both the familial and non-familial models. Therefore, we consider our models appropriate for screening candidates for BRCA1/2 gene testing among patients at high risk for HBOC. However, our model may not be representative of women with breast cancer in the general population or of other ethnicities because our subjects were derived from a Korean population of patients at high risk for HBOC. Therefore, the performance of our model should be evaluated before it is applied in other populations.

This Korean BRCA mutation prediction model, known as KOHCal (KOHBRA BRCA risk calculator), is available on the KOHBRA study website (www.kohbra.kr). KOHCal is simple to use and data input takes little time. The model provides quick guidance for clinicians to identify candidates for BRCA1/2 mutation testing in Korea. The model will also add to our understanding of BRCA gene mutation risk in breast cancer by quantifying the probability of a BRCA gene mutation.