DNA recognition

Reading the minor groove

The design of molecules that recognize specific sequences on the DNA double helix would provide new tools to control gene expression and a rational basis for fresh approaches to drug development. Parts of each base pair are exposed in the two distinct ‘grooves’ of DNA, the major and the minor grooves, and the sequence information of the DNA duplex is available for readout from both of them. On page 468 of this issue1, White et al. describe a molecular code for the recognition of the four Watson-Crick base pairs from the minor groove of DNA using hairpin polyamides containing imidazole, pyrrole and 3-hydroxypyrrole rings.

Ligands that bind double-helical DNA fall into two categories: intercalators, which insert their aromatic rings between two adjacent base pairs, and groove binders, which bind within either groove of the double helix. Intercalators have limited sequence specificity, because they interact with only two base pairs, unless they are linked to a groove-recognition element. Examples are natural molecules such as actinomycin, and synthetic conjugates in which an intercalator has been covalently tethered to a major-groove ligand, such as an oligonucleotide2, or to a minor-groove ligand, such as distamycin3. In living organisms, the reading of sequence information on DNA involves binding of proteins to specific control regions of the genes. These proteins exploit all three modes of binding: intercalation or partial insertion of aromatic amino acids, binding of α-helices into the major groove, and binding of loops of polypeptide chains or β-sheets into the minor groove. But no general amino-acid/base-pair code is available yet.

As Fig. 1 shows, it should be possible to distinguish the four base pairs (G˙C, C˙G, A˙T and T˙A) from the major-groove side through hydrogen-bonding interactions. Oligonucleotides provide a partial solution to this problem; they recognize oligopurine˙oligopyrimidine sequences of double-helical DNA by forming triple-helical complexes4,5. But it remains a challenge to design nucleoside analogues that would allow oligonucleotides to recognize all four base pairs (and not only two of them)6.

Figure 1: Differences in DNA base-pair recognition between the major and minor grooves.

a, A˙T superimposed on T˙A and, b, G˙C superimposed on C˙G, so that the C1′ atoms of the glycosidic linkages coincide. The arrows indicate the positions where hydrogen bonds may form (but not their direction), and point to the O and N hydrogen-bond acceptors and away from the NH2 hydrogen-bond donors. The hydrogen-bonding positions in the minor groove (lower edge of the base pairs) nearly coincide when the base pairs are reversed. In contrast, the hydrogen-bond donor and acceptor groups in the major groove (upper edge), as well as the methyl group of thymines (*), occupy different positions following base-pair reversal. Major-groove ligands should be much better for discriminating between the four base pairs, but White et al.1 show that minor-groove ligands can be designed successfully to achieve this end. (Adapted from ref. 16.)

Studies in the mid-1980s on A˙T-specific minor-groove ligands suggested that it should be possible to design ligands that recognize G˙C base pairs by replacing the pyrrole (Py) rings of distamycin or netropsin with imidazole (Im) rings (the so-called lexitropsins)7,8. But this strategy had limited success in recognizing extended sequences. The breakthrough came with the discovery that 2:1 complexes could form with two distamycin molecules bound side-by-side in the minor groove of A+T-rich sequences9. This led to the idea that covalent linkage of two such monomeric ligands would constitute a strong ligand for the minor groove at A+T-rich sequences10,11. Indeed, hairpin polyamides with Py/Py ‘pairs’ could recognize A˙T and T˙A base pairs12. Introducing imidazole instead of pyrrole rings provided a recognition element for guanine, with Im/Py pairs recognizing G˙C and Py/Im pairs recognizing C˙G base pairs (the replacement of the C(3)H of pyrrole by N(3) allows for the formation of a hydrogen bond with the NH2 group of guanine)13,14,15,16. The degeneracy of the recognition code for A˙T and T˙A remained, however.

Based on the observation of an asymmetrically placed cleft on the minor-groove surface at A˙T base pairs, White et al.1 now report that replacing the C(3)H group of pyrrole by a bulkier C(3)OH group (in 3-hydroxypyrrole; Hp) enables hairpin polyamides to recognize all four base pairs with the following molecular code: Py/Im → C˙G; Im/Py → G˙C; Hp/Py → T˙A; and Py/Hp → A˙T. This A˙T/T˙A discrimination is obtained at the expense of stability, and the destabilization is greater when 3-hydroxypyrrole is on the A side rather than the T side. In the example described by White et al. (see Fig. 2), replacing the Py/Py pair of 1 by a Py/Hp pair as in 2 decreases binding to the sequence 5′-TGGTCA-3′ 191-fold, but binding to 5′-TGGACA-3′ only sixfold. Conversely, when the Py/Py pair is replaced by a Hp/Py pair as in 3, the destabilization is reversed, with 5′-TGGACA-3′ destabilized about 250-fold and 5′-TGGTCA-3′ only sixfold. This result illustrates a common phenomenon in molecular recognition: specificity may be achieved by increased destabilization of interactions at unselected sequences, rather than by enhancing interactions at the specific sites.
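The four-pair code quoted above can be written down as a simple lookup table. The following is a minimal sketch, not a design tool: the function names are hypothetical, the first ring of each pair is taken to read the first base of the base pair as in the code stated in the text, and everything a real hairpin polyamide needs beyond the ring pairs (the turn, the tail, strand orientation, neighbouring-sequence effects) is ignored. The older, degenerate Py/Py pair is included and reported as W, the IUPAC symbol for "A or T".

```python
# Ring pair (first ring, second ring) for each base of the reference strand,
# following the code quoted in the text: Py/Im -> C.G, Im/Py -> G.C,
# Hp/Py -> T.A, Py/Hp -> A.T.
RING_PAIR_FOR_BASE = {
    "C": ("Py", "Im"),
    "G": ("Im", "Py"),
    "T": ("Hp", "Py"),
    "A": ("Py", "Hp"),
}

# Inverse table, plus the degenerate Py/Py pair, which reads A.T or T.A.
BASE_FOR_RING_PAIR = {pair: base for base, pair in RING_PAIR_FOR_BASE.items()}
BASE_FOR_RING_PAIR[("Py", "Py")] = "W"  # IUPAC code for A or T


def ring_pairs_for(sequence):
    """Return one ring pair per base of the reference strand (hypothetical helper)."""
    return [RING_PAIR_FOR_BASE[base] for base in sequence.upper()]


def bases_read_by(ring_pairs):
    """Return the sequence that a list of ring pairs is expected to read."""
    return "".join(BASE_FOR_RING_PAIR[tuple(pair)] for pair in ring_pairs)


if __name__ == "__main__":
    pairs = ring_pairs_for("GTC")
    print(pairs)
    print(bases_read_by(pairs))
    # Replacing Hp/Py by Py/Py restores the A.T / T.A degeneracy.
    print(bases_read_by([("Im", "Py"), ("Py", "Py"), ("Py", "Im")]))
```

Read this way, the contribution of the Hp ring is that no entry in the table maps to W any longer: every ring pair reads exactly one base pair.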

Figure 2: Hairpin polyamides.

a, Hairpin polyamide binding to the minor groove of B-DNA, the helix being unrolled so that the base pairs are horizontal and the helix axis vertical. b, Chemical structures of the N-methyl derivatives of pyrrole (Py), imidazole (Im) and 3-hydroxypyrrole (Hp) used by White et al. to build the hairpin polyamides. c, Binding constants for the hairpin polyamides with pyrrole and/or 3-hydroxypyrrole rings at the two paired positions, for the two sequences in which a T˙A base pair is replaced by an A˙T base pair. The last column shows the ratio of the two binding constants for each ring-pair combination.
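The fold changes quoted in the text for polyamides 2 and 3 can be turned into a selectivity shift and an approximate free-energy penalty. The sketch below assumes that a fold decrease in binding constant corresponds to ΔΔG = RT ln(fold), and it assumes a temperature of 298 K, which is not given in the text. The absolute binding constants of Fig. 2c are not reproduced here, so only the change relative to the Py/Py parent 1 is computed. The names are illustrative.

```python
import math

R_KCAL = 1.987e-3       # gas constant in kcal/(mol*K)
TEMPERATURE_K = 298.0   # assumed temperature; not stated in the text

# Fold decrease in binding relative to the Py/Py parent (polyamide 1),
# as quoted in the text ("about 250-fold" is taken as 250).
FOLD_DECREASE = {
    "2 (Py/Hp)": {"5'-TGGTCA-3'": 191.0, "5'-TGGACA-3'": 6.0},
    "3 (Hp/Py)": {"5'-TGGTCA-3'": 6.0, "5'-TGGACA-3'": 250.0},
}


def selectivity_shift(folds):
    """Ratio of the larger to the smaller fold decrease for one polyamide.

    This is how much the preference between the two sites changes
    relative to the parent, whatever the parent's own preference was.
    """
    values = sorted(folds.values())
    return values[-1] / values[0]


def free_energy_penalty(fold, temperature=TEMPERATURE_K):
    """Approximate destabilization in kcal/mol for a given fold decrease."""
    return R_KCAL * temperature * math.log(fold)


if __name__ == "__main__":
    for name, folds in FOLD_DECREASE.items():
        print(name, "selectivity shift: %.1f-fold" % selectivity_shift(folds))
        for site, fold in folds.items():
            print("  %s: %.2f kcal/mol" % (site, free_energy_penalty(fold)))
```

On these numbers the shift in preference is roughly 32-fold for 2 and 42-fold for 3. Both sites lose affinity, in line with the remark that specificity here comes from destabilizing the unselected sequence more than the selected one.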

White and colleagues' results do not provide evidence for the exact molecular basis of the observed discrimination. It would be surprising if the OH group of 3-hydroxypyrrole plays only a steric role. Hydrogen-bonding interactions might lead to different distortions at A˙T and T˙A base pairs. Replacement of H by OH at the C(3) position of pyrrole may also change the hydration of the hairpin polyamide, and differential hydrophobic effects might be involved in discrimination of A˙T from T˙A. The effect might also depend on the neighbouring base pairs. The substrate described by White et al. (GAC versus GTC) has G˙C and C˙G neighbours. The A˙T/T˙A discrimination is reported to be observed when neighbours are also A˙T base pairs, but the extent of discrimination might vary with the environment and the local polymorphism of DNA.

To achieve selective recognition of a single gene in human cells, a sequence of about 17 base pairs must be recognized6. This might be difficult with hairpin polyamides: the repeat distance between two consecutive units does not exactly match that between two consecutive base pairs, and flexible spacers have to be used to restore the register of the recognition elements17. But might recognition of fewer base pairs allow for gene-specific effects? Within the cell nucleus, DNA sequences are partly buried because of the nucleosomal organization of chromatin and the large number of proteins bound to the genetic material. If so, shorter recognition sequences might suffice, as most identical sequences might not be accessible. If the target sequences on the DNA double helix are within gene regulatory regions then, given that they are accessible to regulatory proteins, the same should apply to ligands such as polyamides or oligonucleotides. However, other sequences within the transcribed regions are also accessible, as shown for 15-mer oligonucleotides recognizing the major groove of DNA18. Further studies are required to tackle these issues.
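One common way to rationalize a figure of about 17 base pairs is a back-of-envelope count of chance matches. The sketch below is such an estimate under loudly stated assumptions, and is not necessarily the argument made in ref. 6: the genome is treated as a random sequence with equal base frequencies, both strands are counted, and the haploid human genome is taken as 3 × 10^9 base pairs, a value that does not appear in the text. The function names are illustrative.

```python
HUMAN_GENOME_BP = 3.0e9  # assumed haploid genome size; not stated in the text


def expected_occurrences(site_length, genome_bp=HUMAN_GENOME_BP):
    """Expected number of chance matches to one site of the given length.

    Assumes a random sequence with equal base frequencies and counts
    both strands (hence the factor of 2).
    """
    return 2 * genome_bp / 4 ** site_length


def shortest_unique_length(genome_bp=HUMAN_GENOME_BP):
    """Smallest site length expected to occur fewer than once by chance."""
    length = 1
    while expected_occurrences(length, genome_bp) >= 1:
        length += 1
    return length


if __name__ == "__main__":
    for n in (8, 12, 16, 17):
        print(n, expected_occurrences(n))
    print("shortest unique length:", shortest_unique_length())
```

Under these assumptions a 16-base-pair site is still expected to occur more than once by chance, whereas a 17-base-pair site is not. The same arithmetic shows why reduced accessibility matters: if only a fraction of the genome were available for binding, the effective genome size in the formula would shrink and a shorter site could suffice.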

Finally, we know little about the behaviour of hairpin polyamides in a cellular environment. Hairpin polyamides can inhibit gene-specific transcription in cell cultures at micromolar concentrations even though they bind naked DNA at subnanomolar concentrations19. We have no data yet on their uptake and compartmentalization in cells, and we don't know whether they bind to cellular components other than nucleic acids. All of these questions must be investigated for hairpin polyamides and related molecules before their promise as gene-specific control agents in vivo is clearly established.


1. White, S., Szewczyk, J. W., Turner, J. M., Baird, E. E. & Dervan, P. B. Nature 391, 468–471 (1998).
2. Sun, J. S. et al. Proc. Natl Acad. Sci. USA 86, 9198–9202 (1989).
3. Bailly, C. et al. Biochemistry 33, 15348–15364 (1994).
4. Le Doan, T. et al. Nucleic Acids Res. 15, 7749–7760 (1987).
5. Moser, H. E. & Dervan, P. B. Science 238, 645–650 (1987).
6. Thuong, N. T. & Hélène, C. Angew. Chem. Int. Edn Engl. 32, 666–690 (1993).
7. Kopka, M. L., Yoon, C., Goodsell, D. S., Pjura, P. & Dickerson, R. E. Proc. Natl Acad. Sci. USA 82, 1376–1380 (1985).
8. Lown, J. W. et al. Biochemistry 25, 7408–7416 (1986).
9. Pelton, J. G. & Wemmer, D. E. Proc. Natl Acad. Sci. USA 86, 5723–5727 (1989).
10. Mrksich, M. & Dervan, P. B. J. Am. Chem. Soc. 115, 9892–9899 (1993).
11. Chen, Y. H. & Lown, J. W. J. Am. Chem. Soc. 116, 6995–7005 (1994).
12. Mrksich, M., Parks, M. E. & Dervan, P. B. J. Am. Chem. Soc. 115, 7983–7988 (1994).
13. Geierstanger, B. H., Mrksich, M., Dervan, P. B. & Wemmer, D. E. Science 266, 646–649 (1994).
14. Trauger, J. W., Baird, E. E. & Dervan, P. B. Nature 382, 559–561 (1996).
15. Singh, M. P., Wylie, W. A. & Lown, J. W. Magn. Res. Chem. 34, F55–F66 (1996).
16. Kopka, M. L. et al. Structure 5, 1033–1046 (1997).
17. Kelly, J. J., Baird, E. E. & Dervan, P. B. Proc. Natl Acad. Sci. USA 93, 6981–6985 (1996).
18. Giovannangeli, C. et al. Proc. Natl Acad. Sci. USA 94, 79–84 (1997).
19. Gottesfeld, J. M., Neely, L., Trauger, J. W., Baird, E. E. & Dervan, P. B. Nature 387, 202–205 (1997).
