The C-terminal BRCT region of BRCA1 is essential for its DNA repair, transcriptional regulation and tumor suppressor functions. Here we determine the crystal structure of the BRCT domain of human BRCA1 at 2.5 Å resolution. The domain contains two BRCT repeats that adopt similar structures and are packed together in a head-to-tail arrangement. Cancer-causing missense mutations occur at the interface between the two repeats and destabilize the structure. The manner by which the two BRCT repeats interact in BRCA1 may represent a general mode of interaction between homologous domains within proteins that interact to regulate the cellular response to DNA damage. The structure provides a basis to predict the structural consequences of uncharacterized BRCA1 mutations.
The C-terminal region in BRCA1 is essential to its tumor suppressor activity, as revealed by missense and truncation mutations within this region that lead to early onset breast cancer (refs 1−4). This region contains two 90−100 amino acid sequence repeats, called BRCT (BRCA1 C-terminal) repeats (refs 5,6), that bear weak amino acid sequence similarity to other proteins involved in DNA repair, such as the yeast protein RAD9 and the mammalian protein XRCC1, as well as the p53 binding protein, 53BP1. BRCT repeats are thought to serve as multipurpose protein−protein interaction modules, binding to other BRCT repeats or other protein domains with apparently unrelated structures. A large body of evidence suggests that the BRCT region of BRCA1 interacts with proteins involved in transcriptional control or DNA repair, including the transcriptional corepressor CtIP, histone deacetylases, p53, p300 and the DNA damage-associated helicase BACH1 (refs 7,8). Here we describe the crystal structure of the dual-repeat BRCT domain of BRCA1. Our work reveals that the two BRCT repeats pack together in a manner that is essential for the tumor suppressor function of BRCA1. We predict that a similar packing arrangement exists in other DNA repair proteins that contain tandem BRCT repeats.
Domain mapping and structure determination
We first used limited proteolytic mapping to locate a folded protein domain within a purified C-terminal fragment of human BRCA1 (residues 1,528−1,863). Both trypsin and elastase rapidly digest BRCA1(1,528−1,863) to a proteolytically stable fragment (Fig. 1). Electrospray mass spectrometry and N-terminal sequencing of the products of the trypsin digestion indicated that the major species had a molecular weight of 25,038 (± 5) Da with the N-terminal sequence 'VNKR', corresponding to BRCA1 residues 1,646−1,863. This fragment contains both BRCT repeats (refs 5,6), suggesting that the two BRCT repeats and the associated linker together form a stable structural unit. Deletion of residues 1,860−1,863, which are not conserved in the other mammalian homologs of BRCA1, yielded a highly soluble fragment, BRCA1(1,646−1,859). We crystallized BRCA1(1,646−1,859) and determined its structure at 2.5 Å resolution using the multiwavelength anomalous dispersion (MAD) method (ref. 9) with a selenomethionine (SeMet) substituted BRCT derivative (Fig. 2a−c; Table 1).
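The residue bookkeeping behind this mapping result can be checked with a short calculation. The sketch below is illustrative only: the function names are ours, and it simply counts residues in an inclusive numbering range and divides the measured mass by that count to obtain an average mass per residue.

```python
def fragment_length(first: int, last: int) -> int:
    """Number of residues in a fragment given inclusive residue numbers."""
    if last < first:
        raise ValueError("last residue must not precede first residue")
    return last - first + 1


def mean_residue_mass(mass_da: float, first: int, last: int) -> float:
    """Average mass per residue (Da) implied by a measured fragment mass."""
    return mass_da / fragment_length(first, last)


# Fragments described in the text (inclusive residue ranges).
tryptic_len = fragment_length(1646, 1863)       # trypsin-resistant fragment
crystallized_len = fragment_length(1646, 1859)  # construct used for crystals

print(f"tryptic fragment: {tryptic_len} residues")
print(f"crystallized construct: {crystallized_len} residues")
print(f"mean residue mass: {mean_residue_mass(25038, 1646, 1863):.1f} Da")
```

The measured mass of 25,038 Da spread over the 218 residues of the tryptic fragment gives an average of roughly 115 Da per residue, a plausible value for a polypeptide.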
Figure 1. The two BRCT repeats form a single protein domain.
BRCA1(1,528−1,863) was digested with either trypsin or elastase for the times indicated, and the products were analyzed by SDS-PAGE. The open arrows indicate BRCA1(1,528−1,863), and the closed arrows indicate the proteolytically resistant fragments.
Figure 2. The structure of the dual repeat BRCT domain of BRCA1.
a, A ribbon representation of the BRCT domain. The secondary structure elements in the C-terminal BRCT repeat are labeled 'prime' to distinguish them from the corresponding secondary structure elements in the N-terminal repeat. b, Cα backbone trace of the BRCA1 BRCT domain. The N-terminal BRCT repeat is colored turquoise; the C-terminal repeat, gold; and the inter-repeat linker, gray. The view is rotated 90° clockwise from the view shown in (a). c, MAD-phased electron density at 2.9 Å resolution, contoured at 1.0σ, is displayed for the inter-BRCT repeat interface. d, A stereo view of a structural alignment of the N- and C-terminal BRCA1 BRCT repeats and the C-terminal BRCT repeat from XRCC1 (ref. 10). Least squares alignments were produced using O (ref. 23).
Overall structure
The dual-repeat BRCT domain of BRCA1 adopts an elongated structure: 70 Å long and 30−35 Å in diameter. Each of the two BRCT repeats adopts a structure similar to that observed in the isolated C-terminal BRCT repeat from the DNA repair protein XRCC1 (ref. 10), as well as the single BRCT repeat in an NAD+-dependent DNA ligase (ref. 11). The BRCT fold is characterized by a central, parallel four-stranded β-sheet, with a pair of α-helices (α1 and α3) packed against one face and a single α-helix (α2) packed against the opposite face of the sheet (Fig. 2a,b). A structural alignment of the two BRCA1 BRCT repeats with the XRCC1 repeat (Fig. 2d) reveals that the relative arrangement of α1, α3 and the central β-sheet is conserved in all three repeats. However, the conformations of the β1−α1, β2−β3 and β3−α2 connecting loops, as well as the orientation of α2 relative to the central β-sheet, are much less conserved. The conservation of the α1−α3−β-sheet structure is maintained by the packing of a limited number of key conserved hydrophobic residues in the core of the BRCT fold (ref. 10).
BRCT repeat interactions
The two BRCT repeats in BRCA1 interact in a head-to-tail fashion, burying 1,600 Å² of hydrophobic, solvent accessible surface area in the interface (Fig. 3a−c). The core of this interface is formed by the interaction of three α-helices: α2 of the N-terminal repeat, and α1' and α3' from the C-terminal repeat. The residues in these helices that contribute to the inter-repeat interface are almost all hydrophobic and pack tightly in a knobs-in-holes manner. The 23-amino acid linker connecting the two BRCT repeats is poorly defined in the electron density, possibly indicating flexibility. The central portion of the linker, however, is clearly helical (αL) and packs against the C-terminal base of the α2−α1'−α3' helical bundle. The only salt bridge across the interface occurs between Arg 1699, immediately N-terminal to α2, and a pair of acidic residues, Glu 1836 and Asp 1840, exposed on the surface of α3'.
Figure 3.
a, Stereo view of the interaction of three helices to form the core of the BRCT repeat interface. b, An electrostatic surface representation of the C-terminal BRCT repeat is displayed with a worm representation of α2 from the N-terminal repeat. c, An electrostatic surface representation of the N-terminal repeat is shown with a worm representation of α1' and α3' from the C-terminal repeat. In (a−c), the N-terminal repeat is colored turquoise; the C-terminal repeat, gold; and residues that cause cancer when mutated, red. d, An amino acid sequence alignment of the regions of BRCA1, 53BP1 and RAD9 that are predicted to form BRCT−BRCT interfaces. Residues that constitute this interface in BRCA1, as well as conserved residues in human 53BP1 and S. cerevisiae RAD9, are colored green. Residues where cancer-causing missense mutations have been identified are boxed in red.
Multiple, tandem BRCT repeats are common in many of the BRCT-containing proteins such as 53BP1, RAD9, RAD4, DNA ligase IV, XRCC1 and the topoisomerase II binding protein TOPBP1. An amino acid sequence alignment of the two BRCA1 BRCT repeats with those in 53BP1 and RAD9 reveals that the residues occupying the interface between the two repeats in α2, α1' and α3' of BRCA1 are highly conserved in 53BP1 and RAD9 (Fig. 3d). This observation strongly suggests that the two BRCT repeats in 53BP1 and RAD9 also pack via a triple helical interface similar to that seen in BRCA1. Sequence alignments in the α2 regions of the other BRCT-containing proteins are less reliable because of the low level of sequence similarity between these proteins in this region. Nevertheless, many of these proteins show significant conservation of interface residues in α1' and α3', suggesting that this mode of packing could be common within the BRCT family. In proteins with more than two BRCT repeats, such as RAD4, the DNA polymerase II subunit Dpb11 and TOPBP1, this kind of packing could result in long, rod-like protein structures whose surfaces would consist of the BRCT repeat loops and the highly variable inter-repeat linker peptides. Such elongated structures could provide a scaffold for the regulated assembly of multiprotein complexes.
Individual BRCT repeats have also been found to interact in trans with the BRCT repeats of other proteins. For example, the C-terminal BRCT repeat in the DNA repair protein XRCC1 has been shown to interact with the DNA ligase III BRCT repeat (ref. 12). Although such interactions may occur between α2 of one protein and the α1/α3 face of the other, recent evidence suggests that XRCC1−DNA ligase III BRCT interactions involve residues exposed on the surface of α1 in both proteins (ref. 13). Structural studies of other BRCT−BRCT complexes, as well as an analysis of the effects of surface exposed mutations on the ability of BRCT proteins to associate and carry out their function in vivo, will be required to further test structural models of heteromeric BRCT−BRCT interactions.
BRCT mutations
The structure of the BRCT region of BRCA1 provides a powerful tool to interpret the large database of mutations in this domain that have been found in breast and ovarian cancer patients. For example, a nonsense mutation at Tyr 1853 deletes the last 11 amino acids of the second BRCA1 BRCT repeat and is associated with early-onset breast cancer (ref. 3). The peptide corresponding to the deleted residues normally adopts an extended conformation, which packs against α2' and the β-sheet of the C-terminal BRCT repeat. Three hydrophobic residues within the deleted region, Tyr 1853, Leu 1854 and Ile 1855, are conserved in other BRCT repeats and are packed in the hydrophobic core of the C-terminal BRCA1 BRCT repeat. We predict that the deletion of these residues should destabilize the protein fold.
Two missense mutations within the BRCT region, A1708E and M1775R, are linked to breast and ovarian cancer (refs 1,2). These mutations cripple the DNA double-strand break repair function of BRCA1 in human cells (ref. 14); block the ability of the BRCT domain to interact with CtIP (refs 15,16), histone deacetylases (ref. 17) and BACH1 (ref. 7); and interfere with the role of the BRCT domain in activating transcription (ref. 18). Ala 1708 and Met 1775 are part of the hydrophobic contact surface between the two BRCT repeats (Fig. 3a−c). Ala 1708 in α2 packs into a small hydrophobic pocket formed by α1' and α3' near the center of the interface, which would not be expected to accommodate the larger, negatively charged Glu at position 1708. Met 1775 is packed within a predominantly hydrophobic pocket near the edge of the inter-repeat interface. Substitution of this residue with an Arg could be sterically accommodated but would position the positive charge near another basic residue, Arg 1835. Therefore, we predict that these mutations destabilize the hydrophobic inter-repeat interface and could lead to a repositioning of the repeats relative to one another or, potentially, to the complete unfolding of the structure.
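The chemical reasoning applied to A1708E and M1775R (a hydrophobic side chain replaced by a charged one) can be expressed as a simple screen. The sketch below is a toy heuristic, not the analysis performed in this work: the residue groupings and function names are assumptions made for illustration, and the screen considers residue chemistry only, ignoring whether a position is buried in the structure.

```python
import re

# Coarse side-chain classes by one-letter code. This grouping is a common
# simplification chosen for illustration, not a classification from the paper.
HYDROPHOBIC = set("AVLIMFWC")
POSITIVE = set("KR")
NEGATIVE = set("DE")


def parse_missense(code: str) -> tuple:
    """Split a code such as 'A1708E' into (wild type, position, mutant)."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", code)
    if match is None:
        raise ValueError(f"not a missense code: {code!r}")
    wild_type, position, mutant = match.groups()
    return wild_type, int(position), mutant


def residue_class(aa: str) -> str:
    """Return the coarse chemical class of a one-letter residue code."""
    if aa in HYDROPHOBIC:
        return "hydrophobic"
    if aa in POSITIVE:
        return "positive"
    if aa in NEGATIVE:
        return "negative"
    return "other"


def introduces_charge(code: str) -> bool:
    """True if a hydrophobic residue is replaced by a charged one."""
    wild_type, _, mutant = parse_missense(code)
    return (residue_class(wild_type) == "hydrophobic"
            and residue_class(mutant) in ("positive", "negative"))


for code in ("A1708E", "M1775R"):
    wt, pos, mut = parse_missense(code)
    print(f"{code}: {residue_class(wt)} -> {residue_class(mut)} at {pos}")
```

Both mutations are flagged by this screen, but a flag alone says nothing about severity; the structural context described above (pocket size, neighboring Arg 1835) is what distinguishes the two cases.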
To directly test the structural consequences of these mutations, we assayed the sensitivity of dual-repeat BRCT domain proteins that harbor these mutations to proteolytic degradation (Fig. 4a). Both wild type and mutant proteins were produced by in vitro transcription/translation because neither the mutant bearing a nonsense mutation at Tyr 1853 (1853-ter) nor the A1708E missense mutant could be expressed in soluble form in E. coli. The wild type protein is highly resistant to digestion by trypsin, elastase or chymotrypsin, indicating that the in vitro produced BRCT protein is stably folded. In contrast, the 1853-ter and A1708E mutants are almost completely degraded by low concentrations of all three enzymes, indicating that these proteins are not stably folded under the assay conditions. The M1775R mutant displays an intermediate sensitivity to proteolytic degradation, suggesting that it has a subtler structural defect.
Figure 4. Analysis of the structural consequences of mutations in the BRCT domain.
a, ³⁵S-Met-labeled wild type BRCA1(1,646−1,859) and variants harboring the indicated mutations were digested with the indicated proteases, and the reaction products were analyzed by SDS-PAGE and autoradiography. Reactions were carried out at elastase concentrations of 0, 3, 30 and 300 µg ml⁻¹ (lanes 1−4); trypsin concentrations of 0, 6, 60 and 600 µg ml⁻¹ (lanes 5−8); and chymotrypsin concentrations of 0, 6, 60 and 600 µg ml⁻¹ (lanes 9−12). b, Missense mutations in the human BRCA1 BRCT domain. Missense mutations derived from the BIC database (ref. 19) are indicated below the BRCT amino acid sequence. The Tyr 1853 truncation mutation is indicated with an X. Mutations with predicted deleterious effects on folding are colored red. Mutations known to cause cancer are boxed. Residues in green are involved in the inter-BRCT repeat interface.
The cancer risks associated with the vast majority of BRCA1 missense mutations deposited in the BIC database (ref. 19) are unknown. Using our structure, we predict that several of these mutations will seriously impair the folding of the BRCT domain and will, therefore, lead to an elevated cancer risk (Fig. 4b). Such mutations include nonconservative substitutions of key hydrophobic residues packed in the protein core or mutations that disrupt electrostatic interactions. Many mutations remain unclassified, such as mutations in the hydrophobic core that replace one hydrophobic residue with another of significantly smaller or larger size. Although such mutations can be accommodated in highly stable proteins such as lysozyme through a subtle repacking of the hydrophobic core (ref. 20), similar mutations may be more detrimental in the less stable BRCA1 BRCT domain. Finally, many mutations occur on the surface of the structure and are not predicted to alter the fold of the BRCT domain but may nevertheless perturb a binding site for an important BRCT partner. The analysis of the biochemical and in vivo effects of such surface mutations could provide strong evidence for the involvement of specific BRCT-interacting proteins in BRCA1 function and tumor suppression.
Methods
BRCT expression and purification. Human BRCA1(1,528−1,863), used in the initial domain mapping experiments, was expressed and purified as a GST-fusion protein by glutathione-affinity chromatography. BRCA1(1,528−1,863) was then cleaved from GST using PreScission protease (Amersham-Pharmacia), and the C-terminal BRCA1 polypeptide was separated from GST by anion exchange chromatography.
Human BRCA1(1,646−1,859), used for crystallization, was expressed as an untagged recombinant protein in E. coli strain BL21(DE3). Purification was achieved using a combination of ammonium sulfate precipitation, hydrophobic interaction, anion exchange and gel filtration chromatography.
Proteolytic mapping of the BRCT domain. Purified BRCA1(1,528−1,863) at 400 µg ml⁻¹ was digested with either elastase (50 µg ml⁻¹) or trypsin (2 µg ml⁻¹) for the indicated times (Fig. 1). The reactions were terminated with phenylmethylsulfonyl fluoride (PMSF), and the reaction products were separated by SDS-PAGE and stained with Coomassie blue.
To assay the proteolytic sensitivity of BRCT domain mutants, wild type BRCA1 BRCT (1,646−1,859), as well as the A1708E, M1775R and 1853-ter (Tyr 1853 mutated to a stop codon) variants, were expressed by in vitro transcription/translation (TNTQuick, Promega) and labeled with ³⁵S-methionine. The transcription/translation reactions were centrifuged to remove insoluble material. From each reaction, 3 µl of the supernatant was then digested with either trypsin, elastase or chymotrypsin for 10 min at 25 °C in a final reaction volume of 15 µl. Reaction products were visualized by SDS-PAGE and autoradiography.
Crystallization. Crystals were grown at 20−23 °C using the hanging drop vapor diffusion technique. Crystals of SeMet-substituted BRCA1(1,646−1,859) were grown by mixing 3 µl of 20 mg ml⁻¹ BRCT domain in protein solution (400 mM NaCl and 5 mM Tris-HCl, pH 7.5) with 3 µl of well solution 1 (1.5 M (NH4)2SO4, 100 mM MES, pH 6.7, and 10 mM CoCl2) (Table 1). Native crystals were grown by mixing 3 µl of 18 mg ml⁻¹ BRCA1(1,646−1,859) in protein solution with 3 µl of well solution 2 (0.8 M Li2SO4, 100 mM Tris-HCl, pH 8.5, 2.5 mM NiCl2 and 5 mM CaCl2).
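Because each drop is made by mixing equal volumes of protein and well solution, every component starts at half its stock concentration before vapor diffusion concentrates the drop. The sketch below illustrates that arithmetic for the SeMet condition; the function name is ours, and the calculation describes only the drop at the moment of mixing, not its composition at equilibrium.

```python
def drop_concentration(stock_conc: float,
                       stock_volume_ul: float,
                       other_volume_ul: float) -> float:
    """Concentration of a component right after two solutions are mixed.

    The component is assumed to be present only in the first solution;
    the result is in the same units as stock_conc.
    """
    total_volume = stock_volume_ul + other_volume_ul
    if total_volume <= 0:
        raise ValueError("total drop volume must be positive")
    return stock_conc * stock_volume_ul / total_volume


# SeMet drop: 3 ul protein solution + 3 ul well solution 1.
protein_mg_ml = drop_concentration(20.0, 3.0, 3.0)      # protein, mg/ml
nacl_mm = drop_concentration(400.0, 3.0, 3.0)           # NaCl, mM
ammonium_sulfate_m = drop_concentration(1.5, 3.0, 3.0)  # (NH4)2SO4, M

print(f"protein: {protein_mg_ml} mg/ml")
print(f"NaCl: {nacl_mm} mM")
print(f"(NH4)2SO4: {ammonium_sulfate_m} M")
```

The drop therefore starts at half the precipitant concentration of the well, which is what drives water out of the drop toward the well during equilibration.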
Data collection and processing. For data collection at 100 K, crystals were flash frozen in liquid nitrogen after gradual transfer to a cryoprotectant solution composed of the respective well solution supplemented with 26% (v/v) glycerol. MAD and native data were obtained at beamlines 14BM-D and 14BM-C, respectively, at the Advanced Photon Source (APS, BioCARS). All data were scaled and reduced with the HKL package (ref. 21).
Phasing, model building and refinement. Of the nine selenium positions, eight were located and refined using SOLVE (ref. 9), and crystallographic phases were improved by solvent flattening and histogram matching implemented in DM (ref. 22). The majority of the polypeptide chain was built with O (ref. 23) using the solvent-flattened, MAD-phased electron density map at 2.9 Å resolution. The initial model was first refined against the remote wavelength (λ3) MAD dataset to 2.9 Å resolution in CNS (ref. 24) using torsion angle molecular dynamics (MLHL target). Maximum likelihood targets, bulk solvent correction and overall anisotropic B-factor scaling were applied throughout the refinement process. Further refinement against the native data to 2.5 Å resolution involved iterative cycles of manual building and restrained refinement with TLS group anisotropic thermal parameter modeling as implemented in REFMAC (v5.0.32) (refs 25,26). Anomalous difference Fourier synthesis, using the native data and phases calculated from the final model, confirmed the positions of 10 out of 13 sulfur atoms. The three missing sulfur atoms correspond to somewhat poorly ordered Met residues, Met 1650, Met 1728 and Met 1827, whose positions were determined from the MAD data.
An additional 15σ peak was found in the anomalous difference map, positioned between two His residues at a crystal packing contact. We have modeled a nickel atom at this position; however, Ca²⁺ is also present in the crystallization solution and might also bind at this site. This accounts for the absolute requirement for divalent metals in the crystallization solution. Poor electron density was observed for much of the inter-repeat linker and the β3−α2 and β3'−α2' loops, suggesting that these regions are relatively flexible. As a result, linker residues 1,743−1,747 have been modeled as polyalanine, whereas residue 1,694 from the β3−α2 loop and residues 1,817−1,819 from the β3'−α2' loop have been omitted from the final model. Analysis of stereochemistry by PROCHECK (ref. 27) indicates that the model contains 82.8% of the residues in the most favorable regions of the Ramachandran plot, with no residues in the disallowed regions. Refinement statistics are provided in Table 1. Figures were created with BOBSCRIPT (ref. 28) and rendered with RASTER3D (ref. 29) (Fig. 3b,c) or POVRAY (ref. 30) (Figs 2a−d, 3a).
Coordinates. The atomic coordinates and structure factors have been submitted to the Protein Data Bank (accession code 1JNX).
Acknowledgments
We would like to thank L. Gaudreau for the gift of the BRCA1(1,568−1,863) expression plasmid and helpful discussions, B. Mark and J. Lamoureux for help with X-ray data collection, and K. Brister and the staff of APS BioCARS for excellent technical support during synchrotron data collection. This work was supported by operating grants from the Canadian Institutes of Health Research, the Canadian Breast Cancer Research Initiative and the Alberta Heritage Foundation for Medical Research.