Introduction

The identification of BRCA1 and BRCA2, which took place two decades ago, opened a new era in cancer diagnosis and prevention, laying the foundation for understanding hereditary cancer susceptibility [1,2,3]. Approximately 30% of high-risk breast and/or ovarian cancer families harbor deleterious mutations in these genes, suggesting that other genes also predispose to hereditary breast cancer. The implementation and broad use of multigene panels for clinical genetic testing has enabled the identification of multiple mutations in genes associated with high or moderate breast cancer susceptibility [4,5,6,7,8]. Of these, CHEK2 mutations are apparently the most frequent after BRCA1 and BRCA2 mutations among breast cancer patients of various ethnicities [4, 9]. CHEK2 encodes a multifunctional serine/threonine protein kinase that is involved in several cellular processes, of which DNA repair through homologous recombination and maintenance of genomic stability are particularly critical. Specifically, CHEK2 activation after DNA damage leads to downstream interaction with BRCA1, BRCA2, and TP53 [10].

Germline loss-of-function (LoF) CHEK2 mutations have been associated with moderate breast cancer risk, with the exact risk varying by specific mutation. The most well-studied CHEK2 mutation so far is c.1100delC, a founder mutation in Eastern Europeans, for which the lifetime breast cancer risk is estimated to be around 25–30% [11, 12]. In contrast, a number of other CHEK2 mutations, such as p.Ile157Thr and p.Ser428Phe, seem to confer lower breast cancer risks, estimated to be around 18% [9, 13]. The majority of breast tumors arising in CHEK2 carriers are luminal, arise from the ductal cells, and are characterized by high expression of estrogen and progesterone receptors [14,15,16]. Notably, in multiple studies, CHEK2 carriers seem to be diagnosed at a relatively young age, with two-thirds of them being diagnosed before the age of 50 years [9, 17].

Interestingly, there seems to be an association of CHEK2 mutations with increased risk for diagnosis of other malignancies such as colorectal, thyroid, prostate, kidney, gastric, and bladder cancer, suggesting that CHEK2 is a multiorgan cancer susceptibility gene [18,19,20].

To date, a number of CHEK2 gene variations have been identified, including all types of events, from substitutions and insertions/deletions of one or a few base pairs to Large Genomic Rearrangements (LGRs) involving the deletion/duplication of several kilobases. The latter seem to occur rather rarely, or are underreported because their detection requires additional experimentation. Interestingly, eight CHEK2 LGRs have been published to date; namely, a duplication encompassing exons 6–13 [21] and eight deletions involving one or more coding exons [22, 23]. Of these, the founder Czechoslovakian 5.6 kb deletion of exons 9 and 10 (del5395), reported to make a substantial contribution to breast cancer in patients of Polish descent, is the only one that has been molecularly characterized [17, 24]. Therefore, in this study we sought to molecularly define, and determine the contribution of, two rare, apparently novel genomic events, which involve the in-frame deletions of exon 6 and exons 2 & 3 of CHEK2 and were initially identified in a Greek high-risk breast cancer cohort.

Materials and methods

Patient and control study group

This study included 2355 Greek female breast cancer cases and 1580 healthy age-matched females. The mean age at breast cancer diagnosis was 54.6 years (range: 20–70 years). Although the patients were not selected based on age or family history, an ascertainment bias does exist due to the Molecular Diagnostics Laboratory’s (MDL) expertise in hereditary cancer. Individuals were referred to the MDL of NCSR ‘Demokritos’ from ‘Mitera’ Hospital and from oncology clinics collaborating with the Hellenic Cooperative Oncology Group. The study was approved by the Bioethics Committees of NCSR ‘Demokritos’ (240/EHΔ/11.3) and Papageorgiou Hospital (193rd Decision of Bioethics Committee), in compliance with the 1975 Declaration of Helsinki. Written informed consent was obtained from all patients prior to genetic analysis.

DNA and RNA extraction

Genomic DNA and RNA were isolated from peripheral blood lymphocytes using the salt extraction protocol proposed by Miller [25] and Trizol reagent (Invitrogen, ThermoFisher Scientific, Carlsbad, CA, USA), respectively.

Determination of the breakpoints of the two novel CHEK2 LGRs

The genomic breakpoints of the two novel CHEK2 LGRs were determined using the TaKaRa LA Taq long-range PCR system (Takara Bio Inc, Kyoto, Japan), according to the manufacturer’s protocol, and PCR products were electrophoresed on a 0.6% agarose gel. A subsequent nested PCR amplification gave rise to smaller DNA fragments, which were then sequenced using the v3.1 BigDye Terminator Cycle Sequencing kit on an ABI 3130XL Genetic Analyzer (ThermoFisher Scientific, Carlsbad, CA, USA).

Screening for CHEK2 p.Glu107_Lys197del, p.Asp265_His282del, and del5395bp LGRs

A custom-designed 25 μl PCR reaction (Biotools B&M Labs, S.A, Madrid, Spain) was subsequently used for the detection of each of the three CHEK2 LGRs. For p.Glu107_Lys197del, PCR products of 682 and 563 bp represented the mutant and the wild-type CHEK2 allele, respectively. For p.Asp265_His282del, PCR products of 814 and 464 bp represented the mutant and the wild-type allele, respectively. For CHEK2 del5395, PCR fragments of 464 and 1872 bp represented the mutant and the wild-type allele, respectively. Primers and protocols are available upon request.
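The band-size logic of the three screening assays can be summarized in a short sketch. The expected product sizes are those stated above; the function name `call_genotype`, the dictionary layout, the ±5% sizing tolerance, and the returned labels are illustrative assumptions and not part of the authors' protocol.

```python
# Expected PCR product sizes (bp) per screening assay, as stated in the text.
EXPECTED_BANDS = {
    "p.Glu107_Lys197del": {"mutant": 682, "wild_type": 563},
    "p.Asp265_His282del": {"mutant": 814, "wild_type": 464},
    "del5395": {"mutant": 464, "wild_type": 1872},
}


def _matches(observed_bp, expected_bp, tolerance):
    """True if an observed band lies within the relative tolerance of the expected size."""
    return abs(observed_bp - expected_bp) <= tolerance * expected_bp


def call_genotype(assay, observed_bands, tolerance=0.05):
    """Classify a sample from the band sizes read off the gel (hypothetical helper).

    The 5% tolerance is an assumed allowance for gel sizing error.
    """
    expected = EXPECTED_BANDS[assay]
    has_mutant = any(_matches(b, expected["mutant"], tolerance) for b in observed_bands)
    has_wild_type = any(_matches(b, expected["wild_type"], tolerance) for b in observed_bands)
    if has_mutant and has_wild_type:
        return "heterozygous carrier"
    if has_mutant:
        return "mutant band only"
    if has_wild_type:
        return "non-carrier"
    return "no call"


if __name__ == "__main__":
    print(call_genotype("p.Asp265_His282del", [812, 470]))
    print(call_genotype("p.Asp265_His282del", [465]))
```

A sample showing both bands is called a heterozygous carrier, while a sample showing only the wild-type band is called a non-carrier; any positive call would still need confirmation by sequencing.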

Reverse-transcriptase PCR

A total of 500 ng of RNA was reverse transcribed using oligo(dT) primers and the MMLV reverse-transcriptase kit (Invitrogen, ThermoFisher Scientific, Carlsbad, CA, USA), following the manufacturer’s instructions.

Haplotype analysis

Haplotype analysis was performed in all p.Asp265_His282del carriers along with fifty-two cancer-free, age-matched women, using seven markers. Specifically, four microsatellite tandem repeats (D22S1150, D22S275, D22S689, and D22S1163) and three single nucleotide polymorphisms (SNPs) (rs5762795, rs6005863, and rs5762764) were used. All markers, apart from D22S275 and rs5762764, are extragenic, and together the markers span an 1850 kb region around CHEK2. The physical distances of the genetic markers were obtained from UCSC Genome Bioinformatics (https://genome.ucsc.edu). The forward primer of each set was labeled with either 6-FAM or HEX, and the fluorescently labeled PCR products were electrophoresed on an ABI 3130XL Genetic Analyzer standardized with ROX-500 (ThermoFisher Scientific, Warrington, UK) and analyzed using the GeneScan 3.1 software (ThermoFisher Scientific, Warrington, UK).

Age estimation of CHEK2 p.Asp265_His282del

The DMLE2.2 software program was used to estimate the age of CHEK2 p.Asp265_His282del. This method is based on the observed linkage disequilibrium between a disease mutation and linked markers in DNA of mutation carriers. The program uses the Markov Chain Monte Carlo algorithm for Bayesian estimation of the mutation age [26]. The population growth rate was 0.135 and was based on demographic data, assuming a time interval of 25 years per generation.

CHEK2 yeast functional assay

The pathogenicity of both novel LGRs was also evaluated using an in vivo CHEK2-mediated functional assay, which was performed at the MDL using a modified version of previously published methods [27, 28]. The model system used is Saccharomyces cerevisiae, as the yeast homologous protein RAD53, which participates in the response to DNA damage, can be partially functionally complemented by human CHEK2 [29]. The yeast strain used lacks both the RAD53 and SML1 genes (strain W2105–17b: MATa sml1Δ::URA3 rad53Δ::HIS3 RAD5 leu2-3,112 trp1-1 can1-100 ura3-1 ade2-1 his3-11,15) and was provided by Dr. R Rothstein [30]. The plasmid pmh267 (pBAD101, 2μ LEU2 GAL-CHEK2), which carries the wild-type CHEK2 gene and was used as the positive control, was provided by Dr. Steven Elledge [29]. The negative control used in this assay was a plasmid carrying the c.1100delC mutation. Plasmids carrying the deletion of exons 2–3, the deletion of exon 6, or c.1100delC were created using the Q5® Site-Directed Mutagenesis Kit (New England Biolabs, Ipswich, Massachusetts, USA). The Frozen EZ Yeast Transformation kit (Zymo Research, Irvine, California, USA) was used for the preparation and transformation of competent yeast cells (Delimitsou A. et al. 2018, unpublished data).

Results

Two novel CHEK2 genomic rearrangements, namely a ~6 kb deletion of exons 2 & 3 and a ~7 kb deletion of exon 6, were initially detected through analysis of a high-risk breast cancer cohort (Fostira F et al. 2018, unpublished data). To assess the prevalence of these CHEK2 LGRs among Greek breast cancer patients, 2355 index cases were screened.

Molecular characterization and prevalence of CHEK2 p.Asp265_His282del mutation

In order to characterize the large deletion encompassing exon 6, the forward primer was designed ~1 kb upstream of exon 5 and the reverse ~600 bp downstream of exon 7. Therefore, PCR products of 8868 bp and 1300 bp were expected to represent the wild-type and mutant alleles (Fig. 1a). Through Sanger sequencing, the exact size of the deletion was determined to be 7566 bp, while the mutation nomenclature following the HGVS rules [31] was defined as c.793_846del7566 (NM_007194.3) and p.Asp265_His282del (NP_009125.1) at the cDNA and protein level, respectively.

Fig. 1

Molecular characterization of CHEK2 LGRs. a Wild-type (8868 bp) and mutant (1300 bp) alleles of a p.Asp265_His282del mutation carrier, visualized on a 0.8% agarose gel. b cDNA analysis of the p.Asp265_His282del mutation, where the 743 bp and 689 bp bands correspond to the wild-type and mutant allele, respectively. c Sanger sequencing electropherogram of cDNA from a p.Asp265_His282del mutation carrier. d PCR products of the customized screening PCR for the p.Asp265_His282del mutation, visualized on an agarose gel, where the 814 bp and 464 bp products represent the mutant and the wild-type CHEK2 allele, respectively. e PCR products of the customized screening PCR for the p.Glu107_Lys197del mutation, visualized on an agarose gel, where the 682 bp and 563 bp products represent the mutant and the wild-type CHEK2 allele, respectively

The mutation was further characterized through RNA analysis, where cDNA amplification revealed two fragments of 743 and 689 bp (Fig. 1b), corresponding to the wild-type and mutant allele, respectively. This indicates that the mutation causes in-frame skipping of exon 6, which is located within the protein’s kinase domain. This finding was confirmed by Sanger sequencing (Fig. 1c).

In our cohort, 0.21% (5/2355) of patients and none of the 1580 controls tested carried the CHEK2 p.Asp265_His282del mutation (Fig. 1d). All mutation carriers reported at least one relative diagnosed with breast cancer, and the mean age at breast cancer diagnosis was 49.6 years (range 38–57 years). Detailed pedigrees of mutation carriers are illustrated in Fig. 2, and all histopathological characteristics are summarized in Table 1.
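As a purely illustrative aid, and not an analysis reported by the authors, the carrier frequency and a one-sided Fisher's exact test on the case/control counts given above can be computed with the standard library alone. The function name `fisher_one_sided` and the choice of test are assumptions made for this sketch.

```python
from math import comb


def fisher_one_sided(carriers_cases, n_cases, carriers_controls, n_controls):
    """One-sided Fisher's exact p-value (hypothetical helper).

    Probability, under the hypergeometric null of no association, that at
    least `carriers_cases` of all observed carriers fall among the cases.
    """
    total = n_cases + n_controls
    total_carriers = carriers_cases + carriers_controls
    denominator = comb(total, total_carriers)
    p_value = 0.0
    for k in range(carriers_cases, min(total_carriers, n_cases) + 1):
        p_value += comb(n_cases, k) * comb(n_controls, total_carriers - k) / denominator
    return p_value


# Counts taken from the text: 5 carriers among 2355 cases, 0 among 1580 controls.
prevalence_percent = 100 * 5 / 2355
p_value = fisher_one_sided(5, 2355, 0, 1580)

if __name__ == "__main__":
    print(f"carrier frequency in cases: {prevalence_percent:.2f}%")
    print(f"one-sided Fisher's exact p-value: {p_value:.4f}")
```

With so few carriers, any such test has limited power, so the result should be read as a rough indication only.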

Fig. 2

Pedigrees of CHEK2 LGR carriers. Families 1070, 1136, 1933, 1937, and 1938 carry the p.Asp265_His282del mutation, while family 1081 carries the p.Glu107_Lys197del mutation. Probands are indicated by an arrow, and breast cancer patients are colored in black. BrCa breast cancer, Ca cancer, CRC colorectal cancer, OvCa ovarian cancer, PrCa prostate cancer, WWII World War II

Table 1 Histopathological characteristics of CHEK2 p.Asp265_His282del carriers

Molecular characterization and prevalence of the CHEK2 p.Glu107_Lys197del

To detect the boundaries of the rearrangement encompassing exons 2 & 3, the forward primer was designed ~5.1 kb upstream of exon 2 and the reverse primer ~1.8 kb downstream of exon 3. Subsequently, two PCR fragments of 7335 and 1500 bp were amplified, corresponding to the wild-type and the mutant allele, respectively (Fig. 1e). The size of the deletion was determined to be 6160 bp by Sanger sequencing and, based on HGVS nomenclature, was defined as c.320_592del6160 (NM_007194.3) and p.Glu107_Lys197del (NP_009125.1) at the cDNA and protein level, respectively. Unfortunately, characterization at the RNA level was not possible due to the lack of a fresh blood sample from the mutation carrier.
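The coordinates reported for the two deletions can be checked for internal consistency with simple arithmetic: the number of deleted coding bases should be a multiple of three for an in-frame event and should equal three times the number of deleted residues. The sketch below uses only the cDNA and protein positions stated in the text; the function name `coding_deletion_summary` is a hypothetical helper, and the check covers length arithmetic only, not codon boundaries or splicing.

```python
def coding_deletion_summary(c_start, c_end, p_start, p_end):
    """Summarize a coding deletion from its cDNA and protein coordinates (inclusive)."""
    coding_bp = c_end - c_start + 1   # deleted coding bases
    residues = p_end - p_start + 1    # deleted amino acids
    return {
        "coding_bp": coding_bp,
        "residues": residues,
        "in_frame": coding_bp % 3 == 0,
        "lengths_agree": coding_bp == 3 * residues,
    }


# Exon 6 deletion: c.793_846, p.Asp265_His282del
exon6 = coding_deletion_summary(793, 846, 265, 282)

# Exons 2 & 3 deletion: c.320_592, p.Glu107_Lys197del
exons2_3 = coding_deletion_summary(320, 592, 107, 197)

# The size difference between the wild-type and mutant cDNA bands reported for
# the exon 6 deletion (743 bp vs 689 bp) should match the deleted coding bases.
cdna_band_difference = 743 - 689

if __name__ == "__main__":
    print(exon6)
    print(exons2_3)
    print(cdna_band_difference == exon6["coding_bp"])
```

Both deletions remove a whole number of codons by length, in agreement with their description as in-frame events.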

In total, 1020 Greek early-onset breast cancer patients (all diagnosed before 45 years of age) were tested for the p.Glu107_Lys197del mutation using a custom-designed PCR. None of them carried the aforementioned deletion, so further analysis was not pursued.

Mutation analysis of CHEK2 del5395 mutation

None of the 2355 Greek patients carried the CHEK2 del5395 mutation, as determined using a customized PCR assay.

Haplotype analysis for the CHEK2 p.Asp265_His282del mutation

Haplotype analysis was performed for the six mutation carriers (five index patients and one healthy carrier) who were available for testing (Table 2). To assess the population allele frequencies of the polymorphic microsatellite markers, 104 chromosomes of healthy age-matched Greek women were analyzed (Suppl. Table 1). The allele distribution differs between mutation carriers and controls, and the presence of a disease-associated haplotype among carriers indicates a single source of the CHEK2 genomic rearrangement. All findings are summarized in detail in Table 2. More specifically, the haplotype associated with the CHEK2 p.Asp265_His282del mutation among SNPs rs5762764, rs6005863, and rs5762795 is ‘G-A-C’, while among the microsatellite markers D22S1163-D22S1150 it is ‘4-4-1-6’ (Table 2). Overall, a region of 1850 kb (4-4-1-G-A-C-6) is shared between all mutation carriers. Additionally, the same allele of the intragenic marker D22S275 is shared among carriers, indicating a common ancestor.

Table 2 Haplotype analysis for SNPs and microsatellite markers of CHEK2 p.Asp265_His282del carriers (1136, 1136a, 1070, 1933, 1937, and 1938) and family relatives that are non-carriers (1136b, 1136c). Alleles segregating with the disease appear in red

Mutation age estimation

The age of the CHEK2 p.Asp265_His282del mutation was estimated by analyzing relatives from five families, including six mutation carriers. The mutation is estimated to have originated 25–62 generations ago, which corresponds to approximately 625–1550 years (r = 0.135). On average, therefore, the haplotype associated with the CHEK2 p.Asp265_His282del mutation was introduced into the population 39 generations (975 years) ago.
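The conversion from generations to calendar years follows directly from the 25-year generation interval assumed in the Methods. A minimal sketch of this arithmetic is given below; the constant and function names are illustrative, and the generation counts are the DMLE estimates quoted in the text rather than values computed here.

```python
# Generation interval assumed in the Methods (years per generation).
YEARS_PER_GENERATION = 25


def generations_to_years(generations, years_per_generation=YEARS_PER_GENERATION):
    """Convert a mutation age in generations to years."""
    return generations * years_per_generation


# Generation estimates quoted in the text: interval bounds and point estimate.
lower_years = generations_to_years(25)
upper_years = generations_to_years(62)
point_years = generations_to_years(39)

if __name__ == "__main__":
    print(lower_years, upper_years, point_years)
```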

Prediction of structure–function consequences of CHEK2 LGRs

To investigate the consequences of the CHEK2 p.Glu107_Lys197del and p.Asp265_His282del mutations at the protein level, the amino acids involved were mapped onto the known crystal structure of human CHEK2 (PDB ID: 3I6U) [32]. The p.Glu107_Lys197del mutation deletes almost the entire FHA domain of CHEK2 (illustrated in Fig. 3a). Although the corresponding region (aa: 107–197) does not include the dimerization interface per se, its deletion results in a disruption of the FHA domain, which in turn is predicted to affect the structural integrity of the dimerization interface.

Fig. 3

Prediction of the structural consequences of the CHEK2 a p.Glu107_Lys197del and b p.Asp265_His282del mutations at the protein level. a On the left-hand side, a CHEK2 dimer is shown, as extracted from the known crystal structure (PDB ID: 3I6U) [32]. On the right-hand side, the FHA region deleted in the p.Glu107_Lys197del mutation is shown. b The amino acid sequence deleted in CHEK2 p.Asp265_His282del is shown on the known crystal structure (P212121 crystal form) of the kinase domain of human CHEK2 (PDB ID: 3I6U) [32]. Only one monomer (chain A) is shown, for clarity. On the right-hand side, the crystal structure of an active form of cAMP-dependent protein kinase (PDB ID: 1ATP) [43] is shown; the regulatory helix αC, functional motifs, and conserved residues of protein kinases are labeled, including the Glu-Lys salt bridge that is conserved in active kinases. The figure was prepared using PyMOL

On the other hand, CHEK2 p.Asp265_His282del results in the deletion of an α-helix in the CHEK2 kinase domain (Fig. 3b). Interestingly, this αC helix is one of the conserved functional motifs of protein kinases, the conformation of which plays a pivotal role in the regulation of their function [33, 34]. In the active conformation, at least two important residues are required: (i) a conserved glutamate residue located in the middle of the helix (Glu273 in CHEK2; Fig. 3b) that forms a conserved salt bridge with a Lys residue (Lys249 in CHEK2, arginine in the 3I6U entry; Fig. 3b), essential for the correct positioning of the ATP phosphate groups and therefore for the phosphotransfer from ATP to protein substrates, and (ii) a conserved hydrophobic residue at the C-terminus of the helix (Leu280 in CHEK2), which is part of the regulatory spine, a spatial motif that is assembled in active kinases and disrupted in inactive ones [35, 36]. Residues corresponding to E273 and L280 of the CHEK2 αC-helix are indispensable for the activation of all kinases [37]. In addition, the packing of hydrophobic residues around helix αC has been proposed to be the driving force for the activation of kinases, and mutations enhancing the hydrophobic character of this region are linked to cancer [38]. Taken together, these observations suggest that the CHEK2 p.Asp265_His282del mutation, which deletes the functionally central αC-helix, yields a non-functional version of the CHEK2 kinase.

In silico analysis

MutationTaster and PROVEAN (Protein Variation Effect Analyzer) software tools were used to evaluate the disease-causing potential of the CHEK2 LGRs, as well as to predict possible impact on the biological function of the CHEK2 protein [39, 40].

CHEK2 yeast functional assay

The growth of the yeast strains carrying the CHEK2 p.Glu107_Lys197del and p.Asp265_His282del mutations was comparable to that of the negative control and significantly different from that of the strain carrying wild-type CHEK2. Therefore, the CHEK2 proteins produced by strains with these mutations cannot complement the loss of RAD53 activity, resulting in a marked decrease in cell proliferation (Fig. 4).

Fig. 4

CHEK2 yeast functional assay. The growth of yeast strains carrying CHEK2 p.Glu107_Lys197del (a) and p.Asp265_His282del (b) is similar to that of strains carrying the CHEK2 loss-of-function mutation (c.1100delC) and significantly lower than that of strains carrying the wild type (pmh267), based on three measurements at 12, 17, and 22 h

Discussion

This is the first detailed investigation of two novel CHEK2 LGRs, in which their molecular characterization and their prevalence in Greek breast cancer patients were determined. Although rare, the in-frame deletions of critical CHEK2 domains caused by the CHEK2 p.Asp265_His282del and p.Glu107_Lys197del mutations could explain a small, but not negligible, proportion of breast cancer susceptibility among patients of Greek descent. Of these, p.Glu107_Lys197del, which results in the deletion of exons 2 & 3, was only seen once, indicating its rarity as a genomic event. Interestingly, a similar deletion, encompassing exons 2 and 3, has been previously reported but has not been molecularly characterized [23].

In contrast, p.Asp265_His282del was detected in 0.21% of the breast cancer cases tested, suggesting its association with breast cancer. The CHEK2 p.Asp265_His282del mutation results in the production of an aberrant isoform at the RNA level and in impaired CHEK2 function, as shown by a S. cerevisiae functional assay. The prediction from the protein structural model also agrees with the mutation’s damaging effect, since CHEK2’s kinase activity is expected to be abolished through the deletion of an α-helix within the kinase domain. Haplotype analysis showed that CHEK2 p.Asp265_His282del is a Greek founder mutation that was introduced into the population ~975 years ago, and it seems to have originated in the western part of Greece (Ioannina, Arta, and Patras).

Interestingly, the CHEK2 p.Asp265_His282del LGR is more frequent than the c.1100delC mutation among Greek breast cancer cases [41], with the majority of carriers being diagnosed at an early age and/or having a family history of breast cancer. This study highlights the importance of population-specific studies. Considering the rarity of such mutational events and the additional experimentation needed to detect them, it is possible that carriers have been missed by conventional methods. Although the rarity of the mutation makes the estimation of cancer risk quite difficult, it is possible that the breast cancer risk conferred by CHEK2 LGRs is similar to the risk conferred by CHEK2 truncating mutations and is therefore clinically important. Furthermore, the elevated risk for other cancer types reported for CHEK2 mutation carriers cannot be assessed for the specific LGR, due to the limited number of carriers. Further studies, including larger numbers of patients, are essential to determine the actual cancer risks and phenotypic spectrum.

The well-studied CHEK2 LGR, del5395, was not detected in any of the Greek breast cancer cases tested, which is consistent with the rarity of this allele in countries that are geographically close to Greece [42]. CHEK2 del5395 seems to be more prevalent in the Czech and Slovak Republics, as well as in Poland, where it was identified in ~1% of breast cancer patients [17, 24].

In conclusion, the present study highlights the existence of rare genomic events in breast cancer predisposing genes other than BRCA1 & BRCA2. These events might not be as rare as currently believed, since they are not routinely assessed. If their prevalence and associated cancer risk are determined and validated in larger cohorts, this can lead to tailored clinical management of mutation carriers. Next-generation sequencing technologies have the potential to enable simultaneous identification of such mutations in genes that predispose to breast cancer. Testing for multiple genes with a combination of methods is, at the moment, the best choice for assessing cancer predisposition in families with a strong family history.