Germline pathogenic variants of 11 breast cancer genes in 7,051 Japanese patients and 11,241 controls

Pathogenic variants in highly penetrant genes are useful for the diagnosis, therapy, and surveillance for hereditary breast cancer. Large-scale studies are needed to inform future testing and variant classification processes in Japanese. We performed a case-control association study for variants in coding regions of 11 hereditary breast cancer genes in 7051 unselected breast cancer patients and 11,241 female controls of Japanese ancestry. Here, we identify 244 germline pathogenic variants. Pathogenic variants are found in 5.7% of patients, ranging from 15% in women diagnosed <40 years to 3.2% in patients ≥80 years, with BRCA1/2, explaining two-thirds of pathogenic variants identified at all ages. BRCA1/2, PALB2, and TP53 are significant causative genes. Patients with pathogenic variants in BRCA1/2 or PTEN have significantly younger age at diagnosis. In conclusion, BRCA1/2, PALB2, and TP53 are the major hereditary breast cancer genes, irrespective of age at diagnosis, in Japanese women.

4. On page 5 the authors state that they studied 11 established hereditary breast cancer genes, but in presenting the data they state that only some of these reached significance. Given that most of these genes have very well established roles in breast cancer predisposition, why are the other genes de-emphasised? Particularly for STK11, TP53 and CDH1, where the number of carriers is small, these will never reach significance, but they are well established cancer predisposition genes. On the other hand, genes such as NBN are not well supported in the literature, and the data presented here show only one carrier in the cases and 3 in the controls. This result should be noted and compared with previous studies, which I believe also show no association with breast cancer predisposition.
5. Some of the comments in the manuscript describe features of breast cancer predisposition that have been established for many years in European populations. While it is interesting (although not unexpected) that the Japanese population shows similarities, the manuscript often implies that these features are being observed for the first time. For example, the fact that many elderly women with breast cancer can be BRCA mutation carriers is not a new idea, but the way it is described suggests this is surprising.
"Moreover, even in the patients diagnosed at 80 years or over, the proportion of patients with pathogenic variants was five times higher than that of controls (aged 60 and over). These results suggest that pathogenic variants in predisposition genes not only contribute to early-onset breast cancer, but also affect late-onset breast cancer in Japanese." In fact, even for BRCA1/2 the penetrance is nowhere near 100%, so clearly there will be elderly mutation carriers. This and other observations need to be conveyed in the context of what has been described in previous studies.
6. The authors should make more of their unique control frequency data. If the males are included, this is a very large cohort, and it would be interesting to expand on how the frequency compares with European data. In addition, there have recently been some smaller studies of Chinese populations. How do the data compare with those?
Reviewer #2: Remarks to the Author: This is an extremely well written and timely manuscript. The investigators performed a case control association study for variants in coding regions of 11 known hereditary breast cancer genes using a large data set of female and male breast cancer patients (and controls) from a predominately Asian population. 244 pathogenic variants were identified of which 131 were novel.
The impact of this work lies within the power of the large sample set to validate novel pathogenic variants. The authors have beautifully described the characterization of these variants and have thoughtfully discussed relevance to existing work, predominately in multi-ethnic populations.
They appropriately discuss the limitations of the work and the likely impact of these limitations, which is relatively low.
The authors also attempt to validate the use of NCCN guidelines in a Japanese population. While they do discuss the ability of the guidelines to detect the described pathogenic variants, this piece of the work is somewhat distracting and takes away from the impactful discovery rather than augments it. Additionally, the data set involved did not capture all of the information required to assess patients according to NCCN guidelines, so assumptions were made that may impact accuracy of the work. Would consider removing this information and submitting as a separate manuscript; however, this is not a fatal flaw that would limit publication.

Reviewer #1 (Remarks to the Author):
This study records the frequency of germline pathogenic mutations in 11 breast cancer predisposition genes among a Japanese population of 7051 women with breast cancers and 11,241 female controls. In addition, 53 male breast cancer patients and 12,520 male controls were studied. Similar studies have been extensively reported on in Western European populations, but this is the largest study of a Japanese population. A strength of the study is the large sample size and the inclusion of controls that seem to be reasonably matched to the cases.
Technically the study appears to have been conducted well and the variant annotation and pathogenicity calling has been conducted rigorously.
Overall the study provides some insight into the prevalence of breast cancer predisposition gene mutations in an Asian population.
We thank the reviewer for carefully reading our manuscript and for providing useful comments.
There are a number of aspects of the manuscript that limit the impact of the work: 1. The extent to which this data is relevant to other Asian populations where there have been few large-scale studies of this kind is unclear. It would be helpful if the authors could indicate how the Japanese population data might reflect Asian populations in general? How will this work change clinical practice in Japan and elsewhere?
[Response 1] Thank you for raising these points. As for the impact of this study on clinical practice in Japan, we identified 244 unique pathogenic variants, of which 131 were novel. In response to this comment, we extended this result to calculate how many patients were newly identified as carrying pathogenic variants. When we used only information from ClinVar, we identified 258 patients with pathogenic variants and missed 146 patients. Therefore, this study identified 57% (146/258) more patients with pathogenic variants. The following figure shows the analysis for each gene separately.
More than 75% of patients with pathogenic variants in BRCA1/2 could be identified by ClinVar alone. However, only a small proportion of patients with pathogenic variants in other genes, especially PALB2, CHEK2, ATM, and NF1, were detected by ClinVar alone. Therefore, this study contributes to improved identification of patients with pathogenic variants in the diagnosis of hereditary breast cancer in Japan, especially with respect to PALB2, CHEK2, ATM, and NF1.
As for the relevance of this study to other Asian populations, we refer to Chinese (Int J Cancer 2017, 141, 129-142) and Malaysian (J Med Genet 2018, 55, 97-103) studies.
Both studies sequenced coding regions in BRCA1/2 in >2,000 selected and unselected breast cancer patients, respectively. The Chinese study identified 175 unique pathogenic variants in 247 of 2,991 (8.3%) patients and 15 of 175 (8.6%) pathogenic variants were shared with our study. Similarly, the Malaysian study identified 97 unique pathogenic variants in 121 of 2,575 (4.7%) patients. Of these pathogenic variants, 15 (15.5%) variants were identified in our study. These results suggest that pathogenic variants identified in this study were shared in Asian populations to some extent.
Therefore, this study contributes to the identification of patients with pathogenic variants in the diagnosis of hereditary breast cancer in other Asian countries. However, it will still be necessary to create a list of pathogenic variants based on a large number of samples for improved diagnosis of hereditary breast cancer.
[Add the following description into the Discussion at L.  We identified 113 variants previously noted as pathogenic. Data from this study helped to classify 131 additional variants, resulting in a total of 244 pathogenic variants identified. This increase resulted in the identification of 57% more patients (from 258 to 404) with a pathogenic variant. Supplemental Figure 5 shows this change in each gene. Although more than 75% of patients with a pathogenic variant in BRCA1/2 could be identified by ClinVar alone, only a small proportion of patients with a pathogenic variant in other genes, especially PALB2 (18%), CHEK2 (8%), ATM (24%), and NF1 (25%), were identified. Therefore, this study contributes to improved identification of patients with a pathogenic variant, especially in genes other than BRCA1/2, in the diagnosis of hereditary breast cancer in clinical practice in Japan. Next, we investigated the proportion of pathogenic variants shared between other Asian countries and this study to address how the Japanese data are relevant to other Asian populations. Two studies from China 33 and Malaysia 34 sequenced BRCA1/2 in >2,000 selected and unselected breast cancer patients, respectively.
The Chinese study identified 175 unique pathogenic variants in 247 of 2,991 (8.3%) patients. Of the 175 pathogenic variants, 15 (8.6%) pathogenic variants were identified in this study. Similarly, the Malaysian study identified 97 unique pathogenic variants in 121 of 2,575 (4.7%) patients. Of these pathogenic variants, 15 (15.5%) variants were identified in our study. These results suggest that pathogenic variants identified in this study were shared in Asian populations to some extent. Therefore, this study contributes to the identification of patients with a pathogenic variant in the diagnosis of hereditary breast cancer in other Asian countries. However, it will still be necessary to create a list of pathogenic variants based on a large number of samples for improved diagnosis of hereditary breast cancer.
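The percentages quoted in this response follow directly from the counts given above. As a minimal arithmetic sketch (the helper name `pct` and the variable names are illustrative assumptions, not part of the study's analysis pipeline), they can be re-derived as follows:

```python
def pct(numerator: int, denominator: int) -> float:
    """Return numerator/denominator as a percentage, rounded to one decimal."""
    return round(100 * numerator / denominator, 1)


# Counts quoted in the response text above.
clinvar_only_patients = 258  # patients identified using ClinVar information alone
all_patients = 404           # patients identified after the additional variants were classified
newly_identified = all_patients - clinvar_only_patients

# Relative increase in identified patients, with ClinVar-only as the baseline.
increase_pct = pct(newly_identified, clinvar_only_patients)

# Proportion of each study's unique pathogenic variants also seen in this study.
shared_chinese_pct = pct(15, 175)
shared_malaysian_pct = pct(15, 97)

print(newly_identified, increase_pct, shared_chinese_pct, shared_malaysian_pct)
```

The unrounded increase is about 56.6%, which the text reports as 57%.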

The male breast cancer data is based on small numbers but it is not unreasonable to include these in the study. However I am confused as to why the male controls are not included in the overall analysis. While there might be expected to be some minor differences in the frequency of pathogenic mutations in males because there will be no attrition of carriers with breast and ovarian cancer, surely this would be relatively minor?
[Response 2] Thank you for recognizing the importance of the male breast cancer data in this study. We analyzed men and women separately because the genetic risk of hereditary breast cancer is known to differ by sex. For example, a previous study that examined 715 male breast cancer patients using a multi-gene panel (Breast Cancer Res Tr 2017, 161, 575-586) showed that BRCA2 and CHEK2 were the most frequently mutated genes, whereas BRCA1 was a low-risk gene (OR = 1.8). Thus, an overall analysis combining men and women would distort the risk estimate for each variant/gene. Therefore, we performed the analysis by sex throughout the manuscript. We have added the following sentence to the Materials and Methods.
[Add the following description into the Methods at L. 102-104] We analyzed women and men separately, as genetic risk for hereditary breast cancer genes differs between men and women 12.
[Response 3] We agree that the description of the variants in the Results section was too extensive. We have moved the following description from the Results section to the Supplemental Note:

From my perspective the aspect of the study that raises it above the numerous similar studies in
[Move the following description to the Supplemental Note at L. 20-29] Sequencing of the 11 established hereditary breast cancer genes identified 1,781 germline variants among 7,051 breast cancer cases and 11,241 controls. According to the genomic position, we categorized the variants into 210 disruptive, 1,084 nonsynonymous, and 487 synonymous variants (Supplemental Table 10). Minor allele frequencies (MAF) of these variants in controls were common (MAF ≥ 5%) for 30 variants, low (5% > MAF ≥ 1%) for 27 variants, and rare (MAF < 1%) for 1,724 variants. More than half of the variants (all rare) were not registered in dbSNP147 1. When we examined the density of variants in each gene, the number of variants was strongly correlated with the gene length (r = 0.953, p = 5.70 × 10^-6, Supplemental Figure 6).
[Response 4] Thank you for the suggestion to highlight the importance of results for additional genes in the Discussion section. We have now added more text to the Discussion as follows:
[Add the following description into the Discussion at L.  The 11 genes analyzed in this study have been reported previously as hereditary breast cancer genes, but the strength of evidence for association of each gene with breast cancer and disease risk varies. Further, published risk estimates are likely to be inflated for at least some genes due to ascertainment bias 3. We observed a significant contribution to breast cancer risk in BRCA1/2, PALB2, and TP53. The disease risks of BRCA1/2 and PALB2 are comparable to those previously reported 3, but the risk for TP53 differs markedly (8.5 in this study and 105 in the previous meta-analysis 3). This is likely explained by several factors. First, previous estimates were based on studies of "familial" patients presenting with clinical features of Li-Fraumeni syndrome, whereas in this study we calculated disease risk for women unselected for family history of cancer. Second, functional effects differ between variants in TP53, which causes a wide range of symptoms, from the severe form known as Li-Fraumeni syndrome to the less severe nonsyndromic predisposition 26, and it is possible that the variants found in patients with unselected breast cancer have less impact on protein function than those identified in patients with classical Li-Fraumeni syndrome. Among four other genes showing P < 0.05 (PTEN, CHEK2, NF1, and ATM), the disease risks of ATM and CHEK2 were comparable to previous reports 3. Disease risks for PTEN and NF1 were not reliably estimated, despite strong evidence for association (P < 5 × 10^-4), due to the low numbers of carriers, indicating the need for even larger studies to estimate risk at the population level. Although the association with breast cancer for CDH1 and STK11 has been reported previously for patients with hereditary diffuse gastric cancer 27 and Peutz-Jeghers syndrome 28, only two and zero Japanese breast cancer patients, respectively, had a pathogenic variant in these genes. That is, CDH1 and STK11 have a limited contribution to breast cancer in unselected Japanese women. The reported contribution of NBN to breast cancer risk was mainly based on one specific variant (c.657del5, rs587776650) in the Slavic population 3, 29, which was not observed in the Japanese population. Other NBN variants designated as pathogenic using ACMG criteria were observed in only 1 case and 3 controls, providing little support for a role of NBN in Japanese unselected breast cancer patients. However, our study has confirmed the importance of the remaining eight genes in genetic testing in Japan and jointly assessed the disease risk of each gene.
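The allele-frequency bins used in the Supplemental Note description above (common, low, rare) are simple thresholds on MAF in controls. The following is a minimal sketch of that binning, assuming MAF is expressed as a fraction between 0 and 1; the function name `maf_category` is hypothetical and is not taken from the study's code. It also checks that the two reported breakdowns both sum to the 1,781 variants stated in the text.

```python
def maf_category(maf: float) -> str:
    """Bin a minor allele frequency (as a fraction) using the thresholds in the text."""
    if maf >= 0.05:
        return "common"  # MAF >= 5%
    if maf >= 0.01:
        return "low"     # 5% > MAF >= 1%
    return "rare"        # MAF < 1%


# Counts reported in the text, grouped by frequency bin and by predicted effect.
reported_by_frequency = {"common": 30, "low": 27, "rare": 1724}
reported_by_effect = {"disruptive": 210, "nonsynonymous": 1084, "synonymous": 487}

total_by_frequency = sum(reported_by_frequency.values())
total_by_effect = sum(reported_by_effect.values())

print(total_by_frequency, total_by_effect)
```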

Some of the comments in the manuscript describe features of breast cancer predisposition that have been established for many years in European populations. While it is interesting (although not unexpected) that the Japanese population shows similarities, the manuscript often implies that these features are being observed for the first time. For example, the fact that many elderly women with breast cancer can be BRCA mutation carriers is not a new idea, but the way it is described suggests this is surprising. "Moreover, even in the patients diagnosed at 80 years or over, the proportion of patients with pathogenic variants was five times higher than that of controls (aged 60 and over). These results suggest that pathogenic variants in predisposition genes not only contribute to early-onset breast cancer, but also affect late-onset breast cancer in Japanese." In fact, even for BRCA1/2 the penetrance is nowhere near 100%, so clearly there will be elderly mutation carriers. This and other observations need to be conveyed in the context of what has been described in previous studies.

Comparison of variant frequency in controls.
This study analyzed 11,241 female controls, but other studies have used data from the Exome Aggregation Consortium (ExAC) 8 as a control for the estimation of disease risk 19 . We investigated the difference in allele frequency between Japanese women in this study and East Asian (EAS) and non-Finnish European (NFE) populations from ExAC without the Cancer Genome Atlas samples. We focused on rare variants with MAF <0.01 because all pathogenic variants were rare. In this study, we identified 1,724 rare variants, of which 1,011 (58.6%) were polymorphic in the controls and the remaining 713 variants were identified only in cases. However, only 87 (5.0%) and 31 (1.8%) were found in the EAS and NFE populations of ExAC, respectively. The frequency of relevant controls is indispensable for assigning clinical significance at PS4 of the ACMG/AMP guidelines and for estimating disease risk. However, because most rare variants were not found in ExAC, population-matched controls are necessary for appropriate assignment of clinical significance and better estimation of disease risk.
They appropriately discuss the limitations of the work and the likely impact of these limitations, which is relatively low.
We thank the reviewer for carefully reading our manuscript and for providing useful comments.
The authors also attempt to validate the use of NCCN guidelines in a Japanese population. While they do discuss the ability of the guidelines to detect the described pathogenic variants, this piece of the work is somewhat distracting and takes away from the impactful discovery rather than augments it. Additionally, the data set involved did not capture all of the information required to assess patients according to NCCN guidelines, so assumptions were made that may impact accuracy of the work. Would consider removing this information and submitting as a separate manuscript; however, this is not a fatal flaw that would limit publication.
[Response] Thank you for raising this important point. As the reviewer indicated, we could not include all information required by the NCCN guidelines because the clinical data of Biobank Japan were not collected for the purpose of studying hereditary breast cancer. We agree that this detracts from the impact of our discovery and have therefore removed all related description from the main text.
However, we compared the proportion of pathogenic variants in each gene between this study and the largest study conducted with 35,409 multiethnic women (Cancer 2017, 123, 1721-1730) in the Discussion section. To do this comparison, it was necessary to select patients using the NCCN guidelines, as was done in the large study noted previously. Therefore, we have kept the description of the NCCN guidelines for this comparison in the Supplemental Note. Because the purpose of describing the NCCN guidelines changed from assessment of the guidelines to selection of patients according to the guidelines, we have modified the description as follows: [Add the following description into the Supplemental Note at L.

Selection of patients according to NCCN guidelines
We selected patients according to the National Comprehensive Cancer Network (NCCN) guidelines for genetic/familial high-risk assessment of breast and/or ovarian cancer (ver. 2.2016) 17 in order to compare the proportion of patients with a pathogenic variant with that of another study 18, which selected patients based on the NCCN guidelines. Since Biobank Japan did not collect clinical information on breast cancer as a hereditary disease, we lacked some information on family members (a known mutation of hereditary breast cancer genes within the family, and age at diagnosis of breast cancer and histology of ovarian cancer in close relatives). Thus, we slightly modified the criteria as follows. (1) Age at breast cancer diagnosis ≤ 50 years old, (2) triple negative breast cancer diagnosed at ≤ 60 years old, (3) bilateral breast cancer, (4) comorbidity of pancreatic cancer at
In this study, we analyzed women and men separately, as genetic risk for hereditary breast cancer differs between men and women 1. However, more variants might be classified as pathogenic by using both female and male controls, because the number of controls roughly doubles from 11,241 to 23,731. To test this possibility, we combined both sets of controls and determined the clinical significance of all variants again. First, we focused on the 1,781 variants found in women to check how the increased number of controls improved the determination of clinical significance. As shown in Supplemental Table 10, we observed that only one variant (p.Leu3048Phe in ATM) changed from "uncertain significance" to "pathogenic" because this variant came to meet PS4 of the ACMG guidelines. Thus, combining female and male controls did not change the pathogenicity classification of many variants.
Then, we performed a gene-based analysis with the 245 pathogenic variants found in women and 39 additional pathogenic variants found only in male controls, using 7,051 cases and 23,731 female and male controls (Supplemental Table 11). Overall, the results were very similar to those in Table 2, which was based on 7,051 cases and 11,241 female controls only. However, when we checked each gene separately, we observed that the odds ratio for BRCA1 decreased substantially, from 33.0 to 20.5, because the frequency of controls with pathogenic variants increased from 0.04% to 0.07% when male controls were added. Among controls, men had more pathogenic variants in BRCA1 (0.1%) than women (0.04%). This result is consistent with the recent publication on male breast cancer 1, which showed that BRCA1 was a low-risk gene (OR = 1.8). Therefore, the female disease risk for BRCA1 would be underestimated if male controls were included.
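The direction of the BRCA1 change described above can be illustrated with an unadjusted odds ratio from a 2×2 table. This is only a sketch: the estimation method actually used in the study is not described in this excerpt, the function name `odds_ratio` is ours, and the carrier counts below are hypothetical values chosen for illustration (only the cohort sizes 7,051, 11,241, and 23,731 come from the text).

```python
def odds_ratio(case_carriers: int, case_total: int,
               control_carriers: int, control_total: int) -> float:
    """Unadjusted odds ratio: odds of carrying a variant in cases vs. controls."""
    case_odds = case_carriers / (case_total - case_carriers)
    control_odds = control_carriers / (control_total - control_carriers)
    return case_odds / control_odds


# Cohort sizes are taken from the text; carrier counts are HYPOTHETICAL.
CASES = 7051
FEMALE_CONTROLS = 11241
ALL_CONTROLS = 23731

hypothetical_case_carriers = 100
hypothetical_female_control_carriers = 5   # about 0.04% of female controls
hypothetical_all_control_carriers = 17     # about 0.07% of combined controls

or_female_only = odds_ratio(hypothetical_case_carriers, CASES,
                            hypothetical_female_control_carriers, FEMALE_CONTROLS)
or_combined = odds_ratio(hypothetical_case_carriers, CASES,
                         hypothetical_all_control_carriers, ALL_CONTROLS)

print(f"female controls only: {or_female_only:.1f}")
print(f"female + male controls: {or_combined:.1f}")
```

With the number of case carriers held fixed, a higher carrier frequency in the combined controls lowers the odds ratio, which is the effect the response describes for BRCA1.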