The homing endonuclease I-CreI uses three metals, one of which is shared between the two active sites
Brett S. Chevalier1, Raymond J. Monnat Jr.2 & Barry L. Stoddard1
1 Fred Hutchinson Cancer Research Center and the Graduate Program in Molecular and Cell Biology, University of Washington, 1100 Fairview Ave. N., A3-023, Seattle, Washington 98109, USA.
2 University of Washington, Department of Pathology, Box 357705 Seattle, Washington 98195 USA.
Homing endonucleases, like restriction enzymes, cleave double-stranded DNA at specific target sites. The cleavage mechanism(s) utilized by LAGLIDADG endonucleases have been difficult to elucidate; their active sites are divergent, and only one low resolution cocrystal structure has been determined. Here we report two high resolution structures of the dimeric I-CreI homing endonuclease bound to DNA: a substrate complex with calcium and a product complex with magnesium. The bound metals in both complexes are verified by manganese anomalous difference maps. The active sites are positioned close together to facilitate cleavage across the DNA minor groove; each contains one metal ion bound between a conserved aspartate (Asp 20) and a single scissile phosphate. A third metal ion bridges the two active sites. This divalent cation is bound between aspartate residues from the active site of each subunit and is in simultaneous contact with the scissile phosphates of both DNA strands. A metal-bound water molecule acts as the nucleophile and is part of an extensive network of ordered water molecules that are positioned by enzyme side chains. These structures illustrate that a unique variant of a two-metal endonuclease mechanism is employed by the highly divergent LAGLIDADG enzyme family.
The homing endonucleases are a diverse family of proteins encoded by open reading frames in genetically mobile introns or inteins1,2,3,4. They have been identified in unicellular eukaryotes, Archaea and eubacteria1,2,3,4. These proteins cleave long DNA target sites (14−40 bp) in homologous alleles that lack the intron or intein. The cleavage event initiates the transfer of the mobile sequence to these sites by a targeted transposition mechanism termed 'homing'. At least four homing endonuclease families have been identified on the basis of conserved sequence motifs that provide residues critical for enzyme folding and catalysis1,2. The LAGLIDADG family is the largest of these families with over 200 known members, each containing one or two copies of a motif that resembles the consensus 'LAGLIDADG' sequence2,5.
Enzymes that contain a single copy of the LAGLIDADG motif (for example, I-CreI and I-CeuI) are homodimers of 15−20 kDa per subunit and recognize pseudo palindromic homing sites6,7,8,9. The LAGLIDADG motif, located near the N-terminus of each monomer, forms an α-helix and packs against its counterpart to form a dimer interface10. The last, strictly conserved acidic residue of each motif is found in the active site of each subunit. Enzymes with two copies of the LAGLIDADG motif (for example, I-DmoI and the endonuclease component of the PI-SceI intein) are roughly twice as large as an I-CreI subunit and act as monomers. These members use the LAGLIDADG motif as an intramolecular domain interface11,12,13. The monomeric LAGLIDADG enzymes usually cleave less symmetric DNA target sites than I-CreI14,15,16,17,18,19. Both the homodimeric and monomeric LAGLIDADG endonucleases generate 4-nucleotide, 3'-extended cohesive ends. Crystal structures have been determined for the intron-encoded I-CreI20 and I-DmoI11 endonucleases and for the intein-associated PI-SceI13 and PI-PfuI12 endonucleases. A crystal structure of a bound complex with DNA has only been reported for I-CreI10 to 3 Å resolution. Despite low sequence homology outside the LAGLIDADG motif, the topologies of these enzymes are quite similar. However, with the exception of the C-terminal acidic residue of each LAGLIDADG motif (underlined and in bold), their active site residues are divergent. This has made it difficult to assign roles for individual active site residues on the basis of structural conservation alone. The study reported here directly illustrates the structural mechanism of DNA cleavage for the I-CreI homing endonuclease.
Overall structure of the complex
The native homing-site DNA is bound in a relatively unperturbed B-form conformation and exhibits a curvature around the enzyme that reduces the distance between minor groove scissile phosphates to 8 Å. In the presence of calcium, the DNA is uncleaved (Fig. 1a), whereas in the presence of magnesium or manganese the DNA is completely cleaved between bases 2 and 3 (Fig. 1b). The only significant conformational change upon cleavage is a movement of 1.5 Å by the liberated 5' phosphate group away from the 3' leaving group. The pseudo symmetric homing-site DNA construct is present in both orientations relative to the enzyme homodimer in the crystals (see Methods). We therefore modeled and refined the I-CreI structures with a mixture of these two DNA orientations. The cleavage state of the DNA, the conformation of the DNA backbone, the position of the cleaved DNA ends and the conformation of protein side chains in the protein−DNA interface are isomorphous between the two DNA orientations and between the separate DNA strands.
a, The substrate complex. b, The cleaved product complex. The electron density corresponds to omit Fo − Fc electron density maps contoured over all residues and nucleotides near the scissile phosphates. Divalent cations are purple, the nucleophilic water in the substrate complex in (a) is blue, and water molecules are green. All atoms modeled in this figure were omitted from the phase calculation for maps. The structure of the substrate complex was determined in the presence of calcium; the scissile phosphodiester bond is intact (black arrow). The structure of the cleaved product complex was determined in the presence of magnesium; the scissile phosphodiester bond is fully cleaved and the 5' phosphate is rotated away from the adjoining ribose sugar. The red density corresponds to the strongest features of an anomalous difference map calculated from data collected from crystals grown in the presence of manganese (contoured at 8σ). The central shared metal ion (2) bridges the enzyme active sites and contacts both scissile phosphates and both LAGLIDADG Asp 20 residues (Asp 20 and 20'). The two flanking metal ions (1 and 1') are each bound by individual active sites and contact a conserved aspartate residue and scissile phosphate groups. The density at base 2 of each half-site corresponds to an equal mixture of adenine and cytosine because the pseudo symmetric DNA is bound in both orientations relative to the I-CreI homodimer (see text). Both refined orientations are shown. c, The sequence of the native homing site is shown for reference: bases conserved between homing half-sites are boxed, and blue dots beneath indicate the nucleotide bases shown in the panels. Figs 1−3 made using program RIBBONS31.
Positions of bound metals and nucleophilic waters
The two active sites are positioned close to one another with a spacing that matches the separation between the scissile phosphates (Fig. 2). The aspartate residues in the C-terminal ends of the LAGLIDADG helices (Asp 20) are separated by and very close to the two-fold dimer symmetry axis. A total of three divalent cations are bound by the enzyme−DNA complex. The binding site of each metal has been verified by anomalous difference maps using crystals grown in the presence of manganese (Fig. 1b). Each individual active site binds one divalent cation that is coordinated by an octahedral arrangement of six ligands. In the uncleaved complex (Figs 2a, 3a, 4a), calcium is bound by one oxygen atom from the conserved aspartate residue (Asp 20), the main chain carbonyl oxygen of Gly 19, a nonbridging oxygen from the scissile phosphate (between nucleotide bases +2 and +3), a second nonbridging oxygen from a phosphate group on the opposite DNA strand (between bases -1 and -2) and two water molecules. One of these water molecules (number 24 in Fig. 4a) is located 3 Å from the scissile phosphate and is well-positioned for in-line attack and hydrolysis of this phosphate (angle from the water nucleophile through the scissile phosphate to the 3' oxygen leaving group). In the cleaved product complex with magnesium (Figs 2b, 3b, 4b), the ligands coordinating the metal are identical with the exception that the nucleophilic water is now a covalently bound oxygen atom on the free 5' phosphate group; this oxygen atom is still directly bound to the metal ion.
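The in-line geometry described above is characterized by the angle from the nucleophilic water oxygen, through the phosphorus atom, to the 3' oxygen leaving group; an ideal in-line attack approaches 180°. The following is a minimal sketch of that measurement. The function name and the coordinates are hypothetical and are not taken from the deposited structures.

```python
import math

def angle_deg(a, b, c):
    """Return the angle a-b-c in degrees, with b at the vertex."""
    ba = [a[i] - b[i] for i in range(3)]
    bc = [c[i] - b[i] for i in range(3)]
    dot = sum(x * y for x, y in zip(ba, bc))
    norm = math.sqrt(sum(x * x for x in ba)) * math.sqrt(sum(x * x for x in bc))
    # Clamp to [-1, 1] to guard against floating-point overshoot.
    cos_theta = max(-1.0, min(1.0, dot / norm))
    return math.degrees(math.acos(cos_theta))

# Hypothetical coordinates in Å, chosen only to illustrate the calculation.
water_o = (0.0, 0.0, 3.0)      # nucleophilic water oxygen, 3 Å from P
phosphorus = (0.0, 0.0, 0.0)   # scissile phosphate P
o3_prime = (0.0, 0.3, -1.6)    # 3' oxygen leaving group

attack_angle = angle_deg(water_o, phosphorus, o3_prime)
print(f"O(water)-P-O3' angle: {attack_angle:.1f} degrees")
```

With real data, the three points would be read from the atom records of the substrate complex coordinate file.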
a, The substrate complex. b, The cleaved product complex. Stereo view is shown looking directly into the minor groove at the cleavage site. Only phosphate groups and DNA sugars of the DNA are shown for clarity. The scissile phosphodiester bond is shown in (a) as a blue P−O bond segment; the direction of nucleophilic water attack on the corresponding red phosphate group is shown by a black arrow. The coloring is the same as in Fig. 1.
Figure 3. Stereo side views of the DNA−protein interface.
a, The substrate complex. b, The cleaved product complex. Only one DNA strand and one active site are shown for clarity. The residue and metal numbers and labels are the same as in Figs 1 and 2. Note the network of water molecules (green spheres) surrounding the scissile phosphate, the nucleophilic water and the 3' oxygen leaving group, and the movement of the cleaved 5' phosphate in the product complex in (b).
Figure 4. Schematic of the active site interactions and solvent network.
a, The substrate complex. b, The cleaved product complex. The residue numbers and labels are the same as in Figs 1−3. Water molecules are numbered consistently with the corresponding PDB files and are shown in blue with the exception of the proposed nucleophilic water molecule (orange 24) in the top panel. Bond distances are given in Å. All direct contacts between the bound metal ions and protein side chains and water molecules and the scissile phosphate are shown by thin green bonds; the contacts to phosphate atoms on the opposite DNA strand are omitted for clarity. The metal ions display octahedral coordination, with ligand bond distances ranging from 2.0 to 2.1 Å for magnesium and 2.4 to 2.5 Å for calcium.
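The coordination geometry summarized in this legend can be checked numerically. The sketch below tests whether six ligands lie within the quoted distance range for a given metal and whether the ligand-metal-ligand angles fit an octahedron (twelve angles near 90° and three near 180°). The tolerances and the function name are illustrative assumptions, not criteria used in the refinement.

```python
import itertools
import math

# Ligand distance ranges (Å) quoted in the legend above.
DISTANCE_RANGES = {"MG": (2.0, 2.1), "CA": (2.4, 2.5)}

def coordination_report(metal, ligands, element, dist_tol=0.1, angle_tol=20.0):
    """Summarize metal-ligand distances and octahedral geometry.

    dist_tol (Å) and angle_tol (degrees) are assumed tolerances.
    """
    lo, hi = DISTANCE_RANGES[element]
    vecs = [tuple(l[i] - metal[i] for i in range(3)) for l in ligands]
    dists = [math.sqrt(sum(x * x for x in v)) for v in vecs]
    distances_ok = all(lo - dist_tol <= d <= hi + dist_tol for d in dists)

    cis = trans = 0
    for (u, du), (v, dv) in itertools.combinations(zip(vecs, dists), 2):
        cos_t = sum(x * y for x, y in zip(u, v)) / (du * dv)
        ang = math.degrees(math.acos(max(-1.0, min(1.0, cos_t))))
        if abs(ang - 90.0) <= angle_tol:
            cis += 1
        elif abs(ang - 180.0) <= angle_tol:
            trans += 1

    octahedral = len(ligands) == 6 and cis == 12 and trans == 3
    return {"distances": dists, "distances_ok": distances_ok, "octahedral": octahedral}

# Idealized octahedron with 2.05 Å ligand distances (hypothetical coordinates).
d = 2.05
ideal = [(d, 0, 0), (-d, 0, 0), (0, d, 0), (0, -d, 0), (0, 0, d), (0, 0, -d)]
report_mg = coordination_report((0.0, 0.0, 0.0), ideal, "MG")
report_ca = coordination_report((0.0, 0.0, 0.0), ideal, "CA")
```

In this idealized case the distances are compatible with magnesium but too short for calcium, which is the kind of distinction the two distance ranges make possible.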
A third metal ion (metal ion 2 in Figs 1−3) is bound at the interface of the enzyme subunits and is octahedrally coordinated by three identical ligands from each active site. In the uncleaved complex, the metal ion contacts an oxygen atom from each of the two conserved aspartate residues of the LAGLIDADG motif, the 3' bridging oxygen of the scissile phosphate on each DNA strand and a nonbridging oxygen from the same scissile phosphates. In the cleaved complex, this coordination is maintained, with the exception that the 3' oxygen is now a hydroxyl group on each cleaved DNA strand.
The metal ions in each individual active site (metal ions 1 and 1') appear to stabilize the phosphoanion transition state during cleavage and to position an activated water molecule for nucleophilic attack. The metal ion bound between the active sites (metal ion 2) simultaneously contacts both scissile phosphates and appears to stabilize the 3' oxygen leaving group on both DNA strands. Therefore, strand scission occurs by a two-metal hydrolytic mechanism, but with an essential bound metal ion shared between the two active sites. To the best of our knowledge, this is the first example of a shared metal ion among active sites for a nuclease or for any enzyme carrying out metal-dependent phosphoryl transfer. This finding should have implications for the kinetic mechanism of double strand cleavage. The hydrolysis of individual phosphates may be more concerted than for other endonuclease families because it is possible that all metal sites must be occupied in the complex before cleavage of either strand can occur. This result may also explain why it has been difficult to create enzymes with 'nicking' activity, in which only one strand of DNA is cut, by mutating individual active site aspartate residues in monomeric LAGLIDADG endonucleases21.
Active site residues and an extensive catalytic solvent network
In addition to the utilization of a shared metal ion by both active sites, the positions and roles of solvent molecules and additional catalytic side chains in I-CreI are unusual. Apart from the conserved aspartic acids, three residues in the I-CreI active site have been identified as particularly important for catalysis: Lys 98, Arg 51 and Gln 47 (ref. 22). The counterparts of these residues are also important in the LAGLIDADG homing endonuclease I-CeuI7. These residues primarily interact with a network of solvent molecules that surround the nucleophilic water molecule (Fig. 4) and extend around the scissile phosphate to the 3' oxygen leaving group. This network includes a water molecule (number 4 in Fig. 4a) that is positioned near the 3' leaving group. This water molecule is not directly coordinated to a metal ion and, therefore, is not likely to be an ideal proton donor. However, because the 3' oxyanion leaving group of the scissile phosphate directly interacts with a metal ion, departure of this oxygen and its eventual protonation is still favorable. I-CreI appears to utilize a mechanism for DNA strand cleavage in which scissile phosphates contact two divalent cations while being extensively hydrated by several water molecules. This hydration shell is in turn structured and polarized by interactions with several basic side chains.
In most type II restriction endonucleases, individual residues directly contact the scissile phosphate oxygens, bind metal ions and help activate the water nucleophile through direct polar contacts23. These residues tend to be conserved among large numbers of restriction enzymes and are usually part of the (PD...(D/E)XK/R) signature for these enzymes, where D/E and K/R represent either of two residues at a single position, X is any residue at a single position and '...' represents a variable number of residues after the signature of these enzymes. Additional residues in type II endonucleases that are peripheral to the site of cleavage and often interact with solvent molecules near the DNA substrate are also catalytically important and result in diminished catalytic efficiency when mutated. These residues tend to be less strictly conserved across the entire enzyme superfamily. The majority of important active site residues in the I-CreI endonuclease and, presumably, in the LAGLIDADG family appear to resemble these latter type II residues in their catalytic role and in their degree of structural conservation. Apart from the C-terminal acidic residues in the LAGLIDADG motif, none of the active site residues that play a role in I-CreI catalysis are strictly conserved throughout the LAGLIDADG family1,5. For example, Lys 98 and 98' in I-CreI have counterparts in the monomeric PI-SceI intein−endonuclease as Lys 301 and Lys 403 (ref. 24). In contrast, a Lys 98 equivalent is present in only one domain of I-DmoI (as Lys 120) (ref. 11) or of PI-PfuI (as Lys 322) (ref. 12). Similarly, I-CreI residues Gln 47 and Arg 51 do not have consistent counterparts in these other LAGLIDADG homing endonucleases.
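The type II restriction enzyme signature described above, PD...(D/E)XK/R, maps naturally onto a regular expression. The sketch below is one possible encoding; the spacer bounds of 1 to 30 residues and the example sequence are illustrative assumptions, since the text specifies only a variable number of residues.

```python
import re

# "PD", a variable-length spacer, then D or E, any residue, then K or R.
# The spacer bounds (1 to 30 residues) are an assumption for illustration.
PD_DEXK = re.compile(r"PD.{1,30}?[DE].[KR]")

def find_signature(seq):
    """Return (start index, matched segment) of the first signature, or None."""
    match = PD_DEXK.search(seq.upper())
    return (match.start(), match.group()) if match else None

# Hypothetical sequence, invented for this example only.
example = "MSTPDLAGVIAFNGEIKLL"
hit = find_signature(example)
print(hit)
```

A sequence match of this kind only flags candidates. As the paragraph above notes, assigning catalytic roles still depends on structural and mutational evidence.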
The catalytic mechanism we have proposed for I-CreI can be generalized to other members of the LAGLIDADG family and may explain some of the divergence of active site residues. The use of basic side chains to position and polarize a network of waters surrounding the scissile phosphate groups might allow a broader range of substitutions than if these residues were used for direct, specific contacts to either the substrate or the transition state during catalysis. This active site structure and catalytic mechanism may have been further diversified by the independent and separate fusion of ancestral homodimeric LAGLIDADG endonucleases1.
Methods
Crystallization. The I-CreI endonuclease was overexpressed and purified as previously described25 with the exception that the induction and subsequent cell growth were conducted at 15 °C to facilitate efficient overexpression. The DNA was purchased from Oligos Etc. and consisted of two strands of sequence: 5'-GCAAAACGTCGTGAGACAGTTTCG-3' and its complement 5'-CGAAACTGTCTCACGACGTTTTGC-3'. The construct forms a 24-bp blunt-ended pseudo palindromic duplex that differs at 4 positions between the two half-sites (Fig. 1). Crystals were grown using a 2.7:1 molar ratio solution of DNA:protein by hanging drop vapor diffusion against a reservoir containing 20 mM NaCl, 100 mM MES pH 6.3−6. and PEG 400 (v/v) 20−35%. The final concentration of I-CreI in the DNA:protein complex solution was 3.5 mg ml−1. The crystallization drops also contained 10 mM of divalent cations that either inhibited (CaCl2) or promoted (MgCl2) cleavage, or permitted cleavage while allowing verification of the metal binding sites by using anomalous difference Fourier maps (MnCl2). For all three separate experiments, the crystals belong to space group P21, with unit cell dimensions a = 43 Å, b = 68 Å and c = 88 Å (Table 1). Although the DNA constructs are identical, and the crystallization conditions similar to those used in previous crystallographic studies20, the resulting crystal form represents a new space group that diffracts to 2 Å resolution or higher.
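The relationship between the two oligonucleotides and the pseudo palindromic character of the duplex can be checked directly from the sequences given above. The following sketch confirms that the second strand is the reverse complement of the first and lists the half-site positions at which the two half-sites are not related by palindromic symmetry. Numbering positions outward from the center of the duplex, and restricting the comparison to the first ten positions of each half-site, are assumptions made for this illustration.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))

def asymmetric_positions(top, max_pos):
    """Half-site positions (1 = next to the duplex center) that break
    palindromic symmetry, checked out to max_pos."""
    center = len(top) // 2
    positions = []
    for k in range(1, max_pos + 1):
        left = top[center - k]        # position k of the left half-site
        right = top[center + k - 1]   # position k of the right half-site
        if left != COMPLEMENT[right]:
            positions.append(k)
    return positions

TOP = "GCAAAACGTCGTGAGACAGTTTCG"
BOTTOM = "CGAAACTGTCTCACGACGTTTTGC"

print(reverse_complement(TOP) == BOTTOM)
print(asymmetric_positions(TOP, 10))
```

Under these assumptions the check reports positions 1, 2, 6 and 7, the same positions named under Structure refinement as violating palindromic symmetry.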
Data collection. Crystals were transferred sequentially to aliquots of the crystallization reservoir with the concentration of PEG 400 increased to (v/v) 30 to 35%. Crystals were suspended in a fiber loop, frozen in liquid nitrogen and maintained at 100 K during data collection. Data corresponding to crystals grown in the presence of calcium or manganese were collected on an in-house Rigaku RAXIS IV area detector mounted on an RU200 rotating anode X-ray generator equipped with mirror focusing optics (Molecular Structure Corporation). Data from crystals grown in the presence of magnesium were collected at the Advanced Light Source (beamline 5.0.2). Data were reduced using the DENZO/SCALEPACK crystallographic data reduction package26 (Table 1).
Structure refinement. The structures of the complexes were solved by molecular replacement with the program EPMR27, using the low resolution model of the enzyme−DNA complex20 as a search probe. Subsequent maps were of excellent quality. The initial models were refined using CNS28 with 5% of the reflections set aside for Rfree29. For subsequent rebuilding, omit maps were calculated. The pseudo symmetric DNA is present in both possible orientations relative to the homodimeric endonuclease; this is readily apparent when omit maps are examined at each of the four basepairs in each half-site that violate palindromic symmetry within the DNA (positions 1, 2, 6 and 7). Refinement of the complexes with the DNA in single orientations rather than in both orientations causes an increase in Rfree from 25% to 27%. This result is confirmed by experiments in which an iododeoxyuracil base is incorporated into the end of one DNA strand. Difference Fourier maps of that complex also indicate a 50/50 mixture of two DNA orientations (B.L.S., unpublished data). The final refined model consists of residues 2−153 for each enzyme subunit, 12 base pairs of DNA for each DNA half-site (modeled and refined as a mixture of two DNA orientations), 3 metal ions per complex and 435 or 857 water molecules for the substrate and product complexes, respectively. For the crystals grown in the presence of manganese, anomalous difference Fourier maps unambiguously confirm the locations of bound metal ions. Geometric analysis of the structure using PROCHECK30 indicates that there are no residues with generously allowed or unfavorable backbone dihedral angles and that 89% of residues are in the core region of the Ramachandran plot.
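The cross-validation statistic used above can be illustrated with a short calculation. The sketch below applies the conventional definition R = Σ|Fo − Fc| / ΣFo to a working set and to a randomly chosen 5% test set. It assumes that the observed and calculated amplitudes are already on a common scale, and it is not the CNS implementation; the function names are hypothetical.

```python
import random

def r_factor(f_obs, f_calc):
    """Conventional crystallographic R: sum|Fo - Fc| / sum(Fo)."""
    return sum(abs(o - c) for o, c in zip(f_obs, f_calc)) / sum(f_obs)

def split_free_set(n_reflections, fraction=0.05, seed=0):
    """Randomly reserve a fraction of reflection indices as the test set."""
    rng = random.Random(seed)
    n_free = max(1, round(n_reflections * fraction))
    free = set(rng.sample(range(n_reflections), n_free))
    work = [i for i in range(n_reflections) if i not in free]
    return work, sorted(free)

def r_work_and_free(f_obs, f_calc, fraction=0.05, seed=0):
    """Return (R_work, R_free) for amplitudes assumed to be pre-scaled."""
    work, free = split_free_set(len(f_obs), fraction, seed)

    def pick(data, idx):
        return [data[i] for i in idx]

    return (r_factor(pick(f_obs, work), pick(f_calc, work)),
            r_factor(pick(f_obs, free), pick(f_calc, free)))

# Synthetic amplitudes, invented for this example only.
rng = random.Random(1)
f_obs = [rng.uniform(10.0, 500.0) for _ in range(1000)]
f_calc = [f * rng.uniform(0.8, 1.2) for f in f_obs]
r_work, r_free = r_work_and_free(f_obs, f_calc)
```

Because the test reflections are excluded from refinement, a rise in Rfree, such as the increase from 25% to 27% reported for the single-orientation DNA model, indicates a poorer model rather than a less heavily fitted one.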
Coordinates. Atomic coordinates of the structures have been deposited in the Protein Data Bank (accession codes 1G9Y for the substrate complex and 1G9Z for the product complex).
Otwinowski, Z. & Minor, W. Methods Enzymol. 276, 307−326 (1997).
Kissinger, C.R. & Gehlhaar, D.K. EPMR: A program for crystallographic molecular replacement by evolutionary search. (Agouron Pharmaceuticals, La Jolla, CA; 1997).
Brunger, A. et al. Acta Crystallogr. D54, 905−921 (1998).
Brunger, A. Acta Crystallogr. D47, 24−36 (1993).
Acknowledgments We acknowledge B. Shen, E. Galburt, C. Spiegel and R. Strong for advice during the crystallographic analysis, P. Rupert, P.-W. Li and A. Ferre-D'Amare for assistance collecting data, and the staff of ALS beamline 5.0.2 for technical assistance. This work was supported by the NIH for B.L.S. and R.J.M. and by the National Cancer Institute for B.C.