A search for modifying genetic factors in CHEK2:c.1100delC breast cancer patients

The risk of breast cancer associated with CHEK2:c.1100delC is two- to threefold, but it is higher in carriers with a family history of breast cancer than in those without, suggesting that other genetic loci in combination with CHEK2:c.1100delC confer an increased risk in a polygenic model. Part of the excess familial risk has been associated with common low-penetrance variants. This study aimed to identify genetic loci that modify CHEK2:c.1100delC-associated breast cancer risk by searching for candidate risk alleles that are overrepresented in CHEK2:c.1100delC carriers with breast cancer compared with controls. We performed whole-exome sequencing in 28 breast cancer cases with germline CHEK2:c.1100delC, 28 familial breast cancer cases and 70 controls. Candidate alleles were selected for validation in larger cohorts. One recessive synonymous variant, rs16897117, was suggested, but no overrepresentation of rs16897117 homozygotes among CHEK2:c.1100delC carriers was found in the subsequent validation. Furthermore, 11 non-synonymous candidate alleles were suggested for further testing, but the validation detected no significant difference in allele frequency between CHEK2:c.1100delC cases and familial breast cancer cases, sporadic breast cancer cases or controls. With this method, we found no support for a CHEK2:c.1100delC-specific genetic modifier. Further studies of CHEK2:c.1100delC genetic modifiers are warranted to improve risk assessment in clinical practice.


Scientific Reports | (2021) 11:14763 | https://doi.org/10.1038/s41598-021-93926-x | www.nature.com/scientificreports/

70 years [8][9][10]. In addition, the c.1100delC allele has been associated with younger age at onset, a threefold increased risk of a second breast cancer, and a worse prognosis among women with oestrogen receptor-positive cancer 9,11,12. The considerably higher risk in women with a family history of breast cancer is in accordance with the suggested polygenic model, where several susceptibility loci together confer a multiplicative effect on breast cancer risk 13,14. That the model can also be applied to CHEK2:c.1100delC carriers is supported by a study of low-risk breast cancer variants in 34 000 women with and without a family history of breast cancer. A polygenic risk score (PRS) based on the combined risk of 74 low-risk variants was calculated. The result suggested that the PRS could be used to stratify risk in c.1100delC carriers and that the low-risk variants explained part of the familial risk. The authors estimated that the 20% of CHEK2:c.1100delC carriers with the highest PRS had a lifetime breast cancer risk of > 30%. Correspondingly, the 20% of carriers with the lowest PRS had an estimated lifetime risk of 14%, which is close to the average population risk 15. A synergistic effect between low-risk variants and BRCA1 and BRCA2 mutations has also been shown 16. The risk of mutation carriers being affected is thus modified by other genetic variants and family history, in addition to lifestyle factors. A risk prediction model, the Breast and Ovarian Analysis of Disease Incidence and Carrier Estimation Algorithm (BOADICEA), has been developed to calculate the lifetime risk of breast cancer, including for carriers of a moderate-penetrance allele such as CHEK2:c.1100delC. The BOADICEA model allows risk stratification for established genetic and non-genetic risk factors 17.
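To illustrate the multiplicative polygenic model described above, a PRS is commonly computed as the sum of risk-allele counts weighted by per-allele log odds ratios. The sketch below assumes this standard form. The variant names and odds ratios are hypothetical placeholders, not the 74 variants or weights used in the cited study.

```python
import math

def polygenic_risk_score(genotypes, per_allele_or):
    """Sum of effect-allele counts (0, 1 or 2) weighted by per-allele log odds ratios."""
    return sum(count * math.log(per_allele_or[variant])
               for variant, count in genotypes.items())

# Hypothetical per-allele odds ratios (illustrative values only).
example_or = {"variant_a": 1.10, "variant_b": 1.05, "variant_c": 0.95}
# Hypothetical genotype: number of effect alleles carried at each variant.
example_genotype = {"variant_a": 2, "variant_b": 1, "variant_c": 0}

prs = polygenic_risk_score(example_genotype, example_or)
# On the odds scale, the combined effect is the product of the per-allele odds ratios.
combined_or = math.exp(prs)
```

Under this assumption, carrying more risk alleles multiplies the odds rather than adding to them, which is why carriers can be ranked into PRS quantiles with markedly different lifetime risks.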
Still, other causative gene variants possibly remain to be identified, since the previously identified low-, intermediate- and high-risk genes cover less than half of the estimated heritable component. Characterising factors that increase the risk in carriers of moderate-risk alleles is important in order to identify the high-risk group that benefits most from preventive interventions. In this study, we used whole-exome sequencing of a CHEK2:c.1100delC-positive cohort with familial breast cancer to identify putative risk-modifying alleles. In the first phase, we aimed to find candidate risk alleles for further validation in the second phase, using larger cohorts of CHEK2:c.1100delC-positive cases and controls.

Results
We performed whole-exome sequencing in 28 breast cancer cases with germline CHEK2:c.1100delC, 28 familial breast cancer cases and 70 controls. Candidate alleles were selected for validation in larger cohorts (Fig. 1).

Recessive variants.
We analysed the exome sequencing data to discover rare homozygous variants in CHEK2:c.1100delC carriers, in order to identify risk alleles with a recessive inheritance pattern. Only one variant, rs16897117, was suggested. Among the 28 CHEK2 carriers, there were 3 patients homozygous for rs16897117, whereas there were no rs16897117 homozygotes among the non-carrier breast cancer cases or healthy controls. We set out to test the hypothesis that rs16897117 is a CHEK2:c.1100delC risk modifier in larger sample collections, starting with 67 CHEK2 patients, as well as 688 non-carrier breast cancer cases and 246 healthy controls. This study confirmed the skewed allele distribution, with fewer individuals heterozygous for rs16897117 among the CHEK2 patients than among non-carrier patients or healthy controls. In a case-only analysis, the odds ratio between the rs16897117 rare allele (A) and CHEK2:c.1100delC was 0.46 (95% confidence interval [CI] 0.17-1.04, P = 0.053; Table 1: SWEA1).
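As an illustration of the case-only odds ratio reported above, the sketch below computes an odds ratio and a 95% confidence interval from a 2x2 table. The source does not state how the interval was obtained, so a Wald interval on the log scale is assumed here. The counts are hypothetical and are not the SWEA1 data.

```python
import math

def odds_ratio_wald_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald confidence interval for a 2x2 table.

    a: c.1100delC carriers with the rare allele
    b: c.1100delC carriers without the rare allele
    c: non-carriers with the rare allele
    d: non-carriers without the rare allele
    """
    if min(a, b, c, d) == 0:
        raise ValueError("zero cell: the Wald interval is undefined")
    estimate = (a * d) / (b * c)
    # Standard error of the log odds ratio.
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(estimate) - z * se)
    upper = math.exp(math.log(estimate) + z * se)
    return estimate, lower, upper

# Hypothetical counts among breast cancer cases (illustrative only).
or_estimate, ci_lower, ci_upper = odds_ratio_wald_ci(10, 90, 40, 160)
```

An interval that includes 1, as in the reported result, means the data are compatible with no association between the two alleles among cases.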
Next, we performed another follow-up using 45 CHEK2 carriers, 87 familial breast cancer patients and 47 controls from the Swedish cohorts. None of the CHEK2 carriers or the familial breast cancer patients were homozygous for the rs16897117 variant. The only two homozygous individuals in this follow-up were identified in the control group. No skewness in allele distribution was observed in any of these groups (Table 1: SWEA2). Because these results were less clear, we tested the association between rs16897117 and c.1100delC in a Finnish population, where the c.1100delC allele has a relatively high frequency of 1.2% 18. Genotyping of three independent patient series identified a single c.1100delC carrier patient who was homozygous for rs16897117. The skewed allele distribution for rs16897117 was observed in the Helsinki cohorts, but not in the Tampere cohort. A study-stratified OR for the association between rs16897117 and c.1100delC, combining all cohorts from Sweden and Finland, was 0.69 (95% CI 0.46-1.03, P = 0.073), encouraging further analysis.
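A study-stratified odds ratio pools per-cohort 2x2 tables without merging their counts directly. The source does not name the estimator used, so the sketch below assumes the Mantel-Haenszel estimator as one common choice. The per-study counts are hypothetical.

```python
def mantel_haenszel_or(tables):
    """Pooled odds ratio across strata.

    tables: iterable of (a, b, c, d) counts, one 2x2 table per study, with
    a/b = carriers with/without the rare allele and
    c/d = non-carriers with/without the rare allele.
    """
    numerator = 0.0
    denominator = 0.0
    for a, b, c, d in tables:
        n = a + b + c + d
        if n == 0:
            continue  # an empty stratum carries no information
        numerator += a * d / n
        denominator += b * c / n
    if denominator == 0:
        raise ValueError("pooled odds ratio is undefined for these tables")
    return numerator / denominator

# Hypothetical per-study counts (illustrative only).
example_tables = [(10, 90, 40, 160), (20, 180, 80, 320)]
pooled_or = mantel_haenszel_or(example_tables)
```

Stratifying by study guards against differences in allele frequency between populations, such as the Swedish and Finnish cohorts here, being mistaken for an association.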
Finally, the genotype data for rs16897117 and c.1100delC were obtained from the OncoArray project of the Breast Cancer Association Consortium (BCAC) 5. The availability of a substantial number of healthy c.1100delC carriers in the consortium data enabled a proper interaction analysis for c.1100delC, rs16897117 and breast cancer risk. In the BCAC data, there was no allelic imbalance between rs16897117 and c.1100delC (Table 2). A likelihood-ratio test comparing a breast cancer risk model containing a c.1100delC-rs16897117 interaction term with a plain model treating c.1100delC and rs16897117 as independent risk factors did not support rs16897117 as a dosage-dependent risk modifier for c.1100delC carriers (Table 3). The BCAC data included four c.1100delC carriers who were homozygous for rs16897117. These were all breast cancer cases, but the sample counts were too low for a reliable analysis.
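The likelihood-ratio comparison described above can be sketched as two nested logistic regressions, one with and one without an interaction term. This is a minimal illustration in plain Python, not the consortium's analysis. It assumes additive coding of rs16897117 dosage (0, 1 or 2), no further covariates, and hypothetical cell counts.

```python
import math

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= f * M[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][k] * x[k] for k in range(i + 1, n))) / M[i][i]
    return x

def max_loglik(X, cases, totals, iters=50):
    """Newton-Raphson logistic fit on grouped data; returns the maximised
    log-likelihood (up to a constant that cancels in the test)."""
    dim = len(X[0])
    beta = [0.0] * dim
    for _ in range(iters):
        grad = [0.0] * dim
        hess = [[0.0] * dim for _ in range(dim)]
        for x, y, n in zip(X, cases, totals):
            mu = 1 / (1 + math.exp(-sum(b * v for b, v in zip(beta, x))))
            for i in range(dim):
                grad[i] += x[i] * (y - n * mu)
                for j in range(dim):
                    hess[i][j] += x[i] * x[j] * n * mu * (1 - mu)
        step = solve(hess, grad)
        beta = [b + s for b, s in zip(beta, step)]
        if max(abs(s) for s in step) < 1e-10:
            break
    loglik = 0.0
    for x, y, n in zip(X, cases, totals):
        eta = sum(b * v for b, v in zip(beta, x))
        loglik += y * eta - n * math.log(1 + math.exp(eta))
    return loglik

def interaction_lrt(cells):
    """cells: list of (carrier, dosage, n_cases, n_controls)."""
    cases = [c[2] for c in cells]
    totals = [c[2] + c[3] for c in cells]
    x_main = [[1.0, c[0], c[1]] for c in cells]
    x_full = [[1.0, c[0], c[1], c[0] * c[1]] for c in cells]
    stat = 2 * (max_loglik(x_full, cases, totals) - max_loglik(x_main, cases, totals))
    stat = max(stat, 0.0)  # guard against tiny negative rounding error
    p_value = math.erfc(math.sqrt(stat / 2))  # chi-square tail, 1 degree of freedom
    return stat, p_value

# Hypothetical counts built so that the two effects combine without interaction.
additive_cells = [(0, 0, 100, 100), (0, 1, 200, 100), (0, 2, 400, 100),
                  (1, 0, 300, 100), (1, 1, 600, 100), (1, 2, 1200, 100)]
stat_null, p_null = interaction_lrt(additive_cells)
```

A small test statistic, as in this constructed example, indicates that the interaction term adds nothing beyond the two independent effects. This corresponds to the conclusion drawn from Table 3.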
Coding non-synonymous candidate variants.
In the discovery phase, exome sequencing data were analysed against a set of criteria in search of CHEK2:c.1100delC candidate variants. Fourteen non-synonymous variants were selected for testing, but only 11 were analysed owing to technical issues with TaqMan probes (Table 4). The 11 missense variants detected in the CHEK2:c.1100delC carriers were evaluated in the validation phase. None of the variants showed patterns in the validation phase similar to those in the discovery phase (Table 5). Thus, none was suggested to be a modifier of breast cancer risk in CHEK2:c.1100delC carriers.

Discussion
We aimed to identify candidate risk variants that specifically modify risk in CHEK2:c.1100delC carriers through whole-exome sequencing of a small number of samples followed by validation in a case-control association study.
No CHEK2:c.1100delC-specific candidate variants could be identified. Previously identified variants that modify breast cancer risk in CHEK2:c.1100delC carriers are also risk variants in the general breast cancer population. The common low-risk variants that predispose to breast cancer have also shown synergistic effects with CHEK2 14.
To our knowledge, no other genetic modifiers of CHEK2:c.1100delC have been suggested. Previously identified common alleles associated with breast cancer in the general population have also been shown to modify risk in BRCA1 and BRCA2 mutation carriers in a subtype-specific manner 16. A recent GWAS identified several novel loci that were associated with at least one tumour feature (ER status, progesterone receptor status, tumour grade, human epidermal growth factor receptor 2), as well as loci that differed by the molecular subtype of breast cancer, luminal or non-luminal 19. These observations imply that tumour features should be taken into account when searching for candidate variants in CHEK2:c.1100delC carriers. Several loci that specifically modify risk in BRCA1 and BRCA2 carriers have also been found 16,[20][21][22][23][24][25][26][27][28][29]. These are all low-risk susceptibility alleles identified through testing of candidates from breast cancer genome-wide association studies in BRCA1/2 mutation carriers and through fine-mapping of candidate regions.
Future studies of CHEK2:c.1100delC-modifying candidates could use looser criteria in the discovery phase to increase the probability of finding good candidates for further testing. In accordance with previous findings, gene-specific modifiers are likely to be common low-risk variants. CHEK2:c.1100delC-specific modifiers may therefore be more readily identified through large-scale genome-wide association studies. With this method, we found no support for a CHEK2:c.1100delC-specific genetic modifier. More studies of CHEK2:c.1100delC genetic modifiers are therefore warranted to improve risk assessment in clinical practice.

Methods
In order to identify candidate variants, we conducted a discovery phase, where whole-exome sequencing was performed in 28 CHEK2:c.1100delC carriers with familial breast cancer, another 28 familial breast cancer patients and 70 healthy controls (spouses of colorectal cancer patients) from the Swedish cohorts. Candidate variants were validated in larger cohorts (Fig. 1).
Sample preparation, discovery phase. Genomic DNA was subjected to whole-exome sequencing at the National Genomics Infrastructure in Uppsala, Sweden. Exome-enriched sequencing libraries were prepared using the Agilent SureSelectXT Human All Exon V5 XT2 + UTR kit (Agilent, Santa Clara, California, US). Cluster generation and 125-cycle paired-end sequencing were performed using the Illumina HiSeq 2500 system and v4 sequencing chemistry (Illumina, San Diego, California, US). Next-generation sequencing was performed at SciLifeLab, University of Uppsala.
Allele frequency. Ratios of the allele frequencies of the variants were calculated. A ratio of 2.0 or more between CHEK2:c.1100delC cases and healthy controls and/or a ratio of 1.5 or more between CHEK2:c.1100delC cases and familial breast cancer cases was required.
Gene function. Selected genes/variants were required to show a function consistent with a putative cancer driver gene when evaluated using online databases (OMIM, GeneCards) and scientific publications available on PubMed.
Reference databases. A more than 30% higher allele frequency in CHEK2:c.1100delC carriers compared with regional reference databases was required (ExAC non-Finnish population, 1000genome2014oct European, SweGen Variant Frequency Browser, exome sequencing data from 200 Danes 30 and anonymous exome data from a cohort of 249 controls from the Department of Clinical Genetics, Karolinska University Hospital).

Sequencing accuracy.
Only variants with a sequencing accuracy of 65% or more in all study groups were included.
The variants passing the selection criteria were functionally annotated using the in silico tools SIFT, Polyphen2 HDIV/HVAR, LRT, MutationTaster, FATHMM, RadialSVM, LR, and MutationAssessor.
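The numeric selection criteria stated above (allele frequency ratios, reference databases and sequencing accuracy) can be sketched as a simple filter. This is an illustration under stated assumptions, not the authors' pipeline. The field names are hypothetical, the reference criterion is assumed to apply to every database listed, and the manual gene-function review is not modelled.

```python
def frequency_ratio(numerator, denominator):
    """Allele frequency ratio; a variant absent from the comparison group
    counts as infinitely enriched if it is present in the carriers."""
    if denominator == 0:
        return float("inf") if numerator > 0 else 0.0
    return numerator / denominator

def passes_filters(variant):
    """variant: dict with hypothetical keys
    af_chek2, af_controls, af_familial: allele frequencies per study group
    af_reference: allele frequencies in the reference databases
    accuracy: sequencing accuracy per study group, as fractions
    """
    af = variant["af_chek2"]
    # Ratio >= 2.0 versus controls and/or >= 1.5 versus familial cases.
    ratio_ok = (frequency_ratio(af, variant["af_controls"]) >= 2.0
                or frequency_ratio(af, variant["af_familial"]) >= 1.5)
    # More than 30% higher than each reference database (assumed: all of them).
    reference_ok = all(af > 1.30 * ref for ref in variant["af_reference"])
    # Sequencing accuracy of 65% or more in all study groups.
    accuracy_ok = all(acc >= 0.65 for acc in variant["accuracy"])
    return ratio_ok and reference_ok and accuracy_ok

# Hypothetical variant (illustrative values only).
example_variant = {
    "af_chek2": 0.10, "af_controls": 0.04, "af_familial": 0.08,
    "af_reference": [0.05, 0.06], "accuracy": [0.90, 0.80, 0.70],
}
keep = passes_filters(example_variant)
```

Loosening the thresholds in such a filter is one way to implement the less stringent discovery criteria that the Discussion proposes for future studies.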
Genotyping of a recessive candidate allele. Exome sequencing data were analysed in search of recessive candidate variants in CHEK2:c.1100delC carriers. One recessive variant, rs16897117, was suggested: among the 28 CHEK2 carriers, there were 3 patients homozygous for rs16897117, whereas there were no rs16897117 homozygotes among the non-carrier breast cancer cases or healthy controls. The variant rs16897117 was further evaluated in Swedish and Finnish cohorts and in data from the Breast Cancer Association Consortium (BCAC).
Swedish cohorts. The