Key Points
- Knowledge of the 3D structures of biological molecules can reveal the fine details of how the molecules perform their biological functions.
- The principal experimental techniques for determining 3D structure are X-ray crystallography and nuclear magnetic resonance (NMR) spectroscopy. In addition, recent advances in electron microscopy, cryo-electron microscopy and electron cryo-tomography have enabled low-resolution structures of large particles, such as viruses and large macromolecular assemblies, to be determined.
- Homology modelling allows the 3D structures of proteins to be computed from their sequence, based on the structure of a close relative. However, great care must be taken when drawing conclusions on the basis of these models, as they are often not sufficiently accurate for predicting detailed or subtle structural changes.
- 3D structure has contributed greatly to our understanding of protein evolution, as structure tends to be more strongly conserved over evolutionary time than sequence. Structural similarity can reveal distant evolutionary relationships between proteins that cannot be detected from comparison of their sequences alone.
- Most single amino-acid changes to a protein's sequence have little effect on structure. Some, however, such as the point mutations that are known to be associated with inherited monogenic diseases, have catastrophic consequences.
- Most disease-associated missense mutations affect either the stability of the associated protein or its ability to fold. Others interfere with its biological function by disrupting its ability to interact with other molecules.
- Sequence insertions and deletions can be accommodated over evolutionary time within a protein's structure without significantly altering its overall fold.
- Alternative splicing of genes can result in different protein products with different functions. In some cases, the change of function can be explained by the consequent change in structure (for example, by the loss or gain of a functional structural domain, or by the subtle modification of the substrate binding site). In most cases, however, the structural differences between splice isoforms are not known (and are difficult to model reliably), so the consequences are difficult to predict.
- The X-ray structure of the nucleosome core particle has revealed the large-scale packaging of DNA in chromatin. This, together with the knowledge of the histone proteins and the modifications they can undergo, is a first step in the understanding of the encoding, inheritance and recognition of epigenetic information.
- The phenotypic differences observed both within and between species result from differences both in the genome sequences and in the regulatory networks through which the genes are expressed. Structural studies have helped to reveal how many of the simpler mechanisms that regulate gene expression operate.
- Although there are currently 47,000 3D structures of biomolecules known, this is a tiny fraction of the 105 million known protein sequences. Several structural genomics projects worldwide are applying high-throughput technologies for structure determination in an attempt to address this shortfall.
Abstract
Detailed knowledge of the three-dimensional structures of biological molecules has had an enormous impact on all areas of biological science, including genetics, as structure can reveal the fine details of how molecules perform their biological functions. Here we consider how changes in protein sequence affect the corresponding 3D structure, and describe how structural information about proteins, DNA and chromatin has shed light on gene regulatory mechanisms and the storage and transmission of epigenetic information. Finally, we describe how structure determination is benefiting from the high-throughput technologies of the worldwide structural genomics projects.
Glossary
- B-DNA: Standard form of double-stranded DNA.
- Fold: The arrangement and connectivity of the regions of regular secondary structure adopted by a given structural domain.
- Fold group: A grouping of similar folds, often merely referred to as a fold.
- Structural domain: A compact part of a protein's 3D structure that is capable of folding independently of any of the protein's other domains. Structural domains often, but not always, correspond to sequence domains.
- PDB (Protein Data Bank): The archive of experimentally determined structures of proteins, RNA and fragments of DNA (see Box 1).
- Secondary structure: Segments of the protein chain that show a regular local structure: coiling into an α-helical segment, or extending to form a β-strand that links up side-by-side with others to form a β-sheet.
- Rational engineering: Use of site-directed mutagenesis to make specific alterations to an enzyme's structure, specificity or catalytic activity.
- Directed evolution: Cycles of error-prone PCR that introduce random point mutations into a small pool of homologous genes, with selection at each cycle of the gene products that have the required properties.
- Pfam: A manually curated database of protein families.
About this article
Cite this article
Laskowski, R., Thornton, J. Understanding the molecular machinery of genetics through 3D structures. Nat Rev Genet 9, 141–151 (2008). https://doi.org/10.1038/nrg2273
DOI: https://doi.org/10.1038/nrg2273