Introduction

Germline mutations in the BRCA1 (GenBank: MIM 113705; http://www.ncbi.nlm.nih.gov) and BRCA2 (GenBank: MIM 600185) genes have been implicated as causal factors in 4–8% of all breast cancer cases occurring in women, and in more than 70% of early familial breast cancer (McClain et al. 2005). Since the identification of the BRCA2 gene sequence (Wooster et al. 1995), a large body of research has been dedicated to identifying BRCA2 mutations and their prevalence in patients of different populations as well as understanding the role of this gene in breast cancer, cellular homeostasis and carcinogenesis (McClain et al. 2005; Shivji and Venkitaraman 2004). Evaluation of BRCA2 germline mutations and polymorphisms is a key determinant in prognosis, risk assessment and therapeutic strategies for breast cancer and other BRCA2-related malignancies (for clinician guidelines, see National Cancer Institute, Breast Cancer Prevention; http://www.cancer.gov).

Materials and methods

The study was performed in accordance with current institutional ethical committee requirements.

Patients

In the present study, a total of 74 subjects were genetically screened, including 35 patients diagnosed with breast cancer, 24 non-breast cancer patients (19 colon cancer patients, 2 Crohn’s disease patients and 3 patients suffering from familial hypertrophic cardiomyopathy) as well as 12 non-affected breast cancer family members and 3 healthy individuals.

Genetic analysis

DNA was isolated from peripheral blood lymphocytes using standard methods or the FlexiGene DNA Kit (Qiagen, Hilden, Germany), according to the manufacturer’s instructions. BRCA2 exon 16 was amplified as described previously (Kataki et al. 2005).

Sequence analysis

The BRCA2 region comprising exon 16 and parts of the flanking introns 15 and 16, as well as BRCA2 exon 15, were sequenced bidirectionally as described previously (Armakolas et al. 2002) in an automated system (Macrogen Sequencing Service). Primers designed to amplify into the BRCA2 intron 15–exon 16 border covering the 16 bp region (according to the NHGRI BCIC database) were: forward primer: TGTGTAGGTGTTCTCATAAACAG; reverse primer: AAAGAGGGATGAGGAATAC (the 16 bp of interest, GTGTTCTCATAAACAG, are the final 16 nucleotides of the forward primer).
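The relationship between the forward primer and the 16 bp region can be checked with a short Python sketch. It is illustrative only: the sequences are those quoted in the text, whereas the helper `reverse_complement` and all variable names are assumptions introduced here and are not part of the original protocol.

```python
# Sequences as quoted in the text (5'->3').
FORWARD_PRIMER = "TGTGTAGGTGTTCTCATAAACAG"
REVERSE_PRIMER = "AAAGAGGGATGAGGAATAC"
REGION_16BP = "GTGTTCTCATAAACAG"  # the 16 nucleotides of interest


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq.upper()))


# Position of the 16 bp region inside the forward primer (-1 if absent).
offset = FORWARD_PRIMER.find(REGION_16BP)

# True if the region runs to the 3' end of the primer, i.e. the primer
# can only anneal fully where these 16 nucleotides are present.
region_is_3prime_end = offset + len(REGION_16BP) == len(FORWARD_PRIMER)

# The reverse primer lies elsewhere in the amplicon: it contains neither
# the region nor its reverse complement.
reverse_overlaps = (
    REGION_16BP in REVERSE_PRIMER
    or reverse_complement(REGION_16BP) in REVERSE_PRIMER
)

print(offset, region_is_3prime_end, reverse_overlaps)
```

Because the 16 nucleotides make up the 3′ end of the forward primer, a template lacking them at the expected position would not be expected to support amplification with this primer.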

Enzymatic mapping

The 325 bp PCR-amplified DNA from biological samples was digested with the restriction enzyme PvuII (New England Biolabs, Beverly, MA), which recognises the sequence CAGCTG.
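The logic of this assay can be sketched as an in-silico digest. The example below is a minimal sketch under stated assumptions: PvuII is modelled as cutting bluntly in the middle of its site (CAG^CTG), and the two 325 bp "amplicons" are synthetic placeholders built from filler bases, not the real BRCA2 sequence. The site is placed so that cutting yields the 244 and 81 bp products expected for a site-containing amplicon; the function name `digest` and the orientation of the two fragments are assumptions made for illustration.

```python
def digest(seq: str, site: str = "CAGCTG", cut_offset: int = 3) -> list[int]:
    """Return fragment lengths after cutting `seq` at every occurrence of `site`.

    `cut_offset` is the position of the cut within the site
    (3 models the blunt CAG^CTG cut of PvuII).
    """
    seq = seq.upper()
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = seq.find(site, start + 1)
    bounds = [0] + cuts + [len(seq)]
    return [end - begin for begin, end in zip(bounds, bounds[1:])]


# Synthetic placeholder amplicons (NOT real BRCA2 sequence), both 325 bp.
amplicon_with_site = "A" * 241 + "CAGCTG" + "A" * 78
amplicon_without_site = "A" * 325

print(digest(amplicon_with_site))     # site present: two fragments
print(digest(amplicon_without_site))  # site absent: amplicon stays uncut
```

An amplicon that remains at its full length after PvuII treatment therefore indicates that the recognition site is absent from the amplified region.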

Immunocytochemistry

BRCA2 protein integrity was monitored in MCF7 cells by immunofluorescence staining with a polyclonal antibody specific for the carboxy-terminal amino acids 3245–3418 of human BRCA2 protein (BRCA2 Ab-2; Calbiochem, San Diego, CA).

Results

During the course of a mutational analysis of the entire coding sequence of the BRCA2 gene in DNA isolated and PCR-amplified (Kataki et al. 2005) from peripheral blood lymphocytes of breast and ovarian cancer patients and non-affected breast cancer family members, we made some initially striking observations in two samples, both from individuals with a distinct family history of disease. Aligning the sequencing data from these samples to the BRCA2 exon 16 sequence listed in an established database (NHGRI Breast Cancer Information Core; http://www.research.nhgri.nih.gov) indicated a large deleterious deletion (nt7602del16) of 16 nucleotides (GTGTTCTCATAAACAG) at the intron 15–exon 16 border of the BRCA2 gene in both individuals. Literature and database searches revealed that the nt7602del16 mutation had been reported previously in a Japanese population genetics study (Katagiri et al. 1998) and is listed in the Human Gene Mutation Database (HGMD: CD982507; http://www.archive.uwcm.ac.uk). This nt7602del16 (aa2534–2539) mutation of BRCA2 exon 16 was detected in four members of a breast-cancer-affected family (Katagiri et al. 1998) and was regarded as pathogenic because it results in an early termination codon at aa2545 of the BRCA2 protein.

However, the sequencing data in our study indicated that these nt7602del16 carriers were actually homozygous for this mutation (the sequencing data in this region show no superimposed peaks; Fig. 1a). This prompted us to perform more extensive tests on another six randomly selected samples. Since the sequencing data provided evidence of the same homozygous deletion in all DNA samples tested, a larger number of subjects of broader origin was investigated by enzymatic mapping of the 325 bp PCR-amplified DNA region of interest. These subjects comprised a random selection of 35 breast cancer patients and 12 non-affected breast cancer family members previously screened for mutations across the entire BRCA2 gene (Armakolas et al. 2002), as well as another 19 colon cancer patients, 2 Crohn’s disease patients, 3 familial hypertrophic cardiomyopathy patients and 3 healthy individuals. The restriction enzyme PvuII recognises the sequence CAGCTG, which is expected to be intact in samples containing the 16 nucleotides of interest in BRCA2 exon 16. PvuII failed to digest any of the studied samples, indicating that none of them contains this 16 bp region in exon 16 (Fig. 2). In parallel, a new pair of PCR primers specific for the described BRCA2 intron 15–exon 16 sequence was designed and used on six representative samples. However, no target sequence appears to be present in any of the studied samples, since this sequence failed to amplify from all tested DNA.

Fig. 1a,b

Representative sequence analyses of intron 15–exon 16 and exon 15–intron 15 region of the BRCA2 gene. a Sequencing data of the PCR-amplified intron 15–exon 16 DNA region from MCF7 cells; lack of superimposing peaks signifies homozygosity. b Sequence data of the BRCA2 exon 15–intron 15 region from a breast cancer patient blood sample containing the 16 nucleotides without indications of heterozygosity. The 16 bp sequence of interest is indicated by a line above the corresponding nucleotides

Fig. 2

Example of PvuII-restriction analysis. A random selection of PvuII-digested PCR-amplified 325 bp DNA regions of BRCA2 exon 16 from diverse samples of breast cancer and non-breast cancer patients (lanes 1–6, 8–15), healthy individuals (lanes 16–18) and MCF7 cells (lane 19). Lane 7: negative control. PvuII treatment did not produce the expected digestion products (244 and 81 bp) in any of the samples, indicating absence of the enzyme’s recognition site in the DNA region of interest. Sizes (bp) of DNA-size markers are noted

All the screening tests were also performed on DNA isolated from MCF7 cells, which have repeatedly served as a BRCA2-positive control in a variety of studies (Data Sheet PC146 Rev.13/9/2004, Calbiochem; Vissac et al. 2002). According to these results, MCF7 cells also carry the nt7602 16 bp deletion, which would imply expression of a BRCA2 protein truncated at aa2545 in these BRCA2-positive cells. Immunocytochemical staining of MCF7 cells with an antibody specific for the carboxy-terminal aa3245–3418 of human BRCA2 protein was therefore performed. As expected for a BRCA2-positive control, but contrary to what the apparent deletion would predict, MCF7 cells reacted clearly with the BRCA2 antibody, demonstrating expression of an intact BRCA2 protein, with its carboxy terminus present, in MCF7 cells.

In summary, our experimental data show that all subjects, including breast cancer patients and healthy subjects, appeared to be homozygous carriers of the BRCA2 nt7602del16 mutation, while expression of intact BRCA2 protein was demonstrated in samples that would be expected to contain a BRCA2 protein truncated at aa2545 if the deletion existed. These observations directed our efforts towards investigating the validity of the information listed in the databases. Additional sequence analysis of the exon 15 region of BRCA2 in a selection of the previously screened samples revealed that the 16 bp sequence we had suspected to be deleted, when listed as part of BRCA2 intron 15–exon 16, is actually located at the exon 15–intron 15 border of the BRCA2 gene (Fig. 1b). In particular, the GTGTTCTCATAAACAG sequence represents the terminal nucleotides of BRCA2 exon 15 but has been erroneously allocated to the intron 15–exon 16 sequence in certain highly referenced databases (Fig. 3). This conclusion from our research data was ultimately confirmed by sequence data provided in a National Center for Biotechnology Information database link (NCBI Evidence Viewer, BRCA2 exon 16: NT024524.13; U43746; http://www.ncbi.nih.gov/mapview).

Fig. 3

a Incorrect alignment of 16 nucleotides (shaded area) at the BRCA2 intron 15–exon 16 border has resulted in a misquoted mutation (CD982507). b Correct BRCA2 exon 15–intron 15–exon 16 sequence: the shaded 16 bp are the final nucleotides of BRCA2 exon 15. The corresponding BRCA2 protein amino acid sequence is shown below each DNA sequence. Boxed nucleotides: PvuII restriction site

Discussion

The extensive clarification of our initial observations regarding the presence of the nt7602del16 mutation in exon 16 of the BRCA2 gene in a total of 74 subjects and the MCF7 cell line verified that nt7602del16 has been misquoted as a mutation. The listed (HGMD: CD982507) nt7602del16 mutation does not exist. Incorrect placement of this sequence, which forms the end of BRCA2 exon 15, in this database apparently accounts for the misinterpretation of this sequence alignment as a deletion, in addition to a ccc→ctc alteration listed as an intronic (int15+5) mutation in the same study (Katagiri et al. 1998).

Publicly accessible sequence databases and BRCA1/2 polymorphisms and mutation registries (e.g. Breast Cancer Information Core; Szabo et al. 2000) are indispensable tools both for clinicians and geneticists as reference data (see Wolfsberg et al. 2003). Notably, in the case of BRCA1/2 pathogenic mutation carriers, risk-management strategies dictate treatment designs, including drastic therapeutic interventions such as prophylactic mastectomy (Rebbeck et al. 2004). This alone highlights the role of databases in areas that extend beyond researcher confidence, even reaching public health issues. Our experience in screening high-risk populations for BRCA1/2 mutations (Kataki et al. 2005; Armakolas et al. 2002) has taught us diligence and vigilance in interpreting scientific data by means of databases. These reported population-specific polymorphisms, pathogenic mutations and genotype–phenotype correlations acquire reference value for numerous investigators involved in defining the genetic background of breast cancer, and therefore, data accuracy is critical. Optimising the information provided in databases is an ongoing interactive process crucial to genetic epidemiological research.