Architecture and secondary structure of an entire HIV-1 RNA genome


Single-stranded RNA viruses encompass broad classes of infectious agents and cause the common cold, cancer, AIDS and other serious health threats. Viral replication is regulated at many levels, including the use of conserved genomic RNA structures. Most potential regulatory elements in viral RNA genomes are uncharacterized. Here we report the structure of an entire HIV-1 genome at single nucleotide resolution using SHAPE, a high-throughput RNA analysis technology. The genome encodes protein structure at two levels. In addition to the correspondence between RNA and protein primary sequences, a correlation exists between high levels of RNA structure and sequences that encode inter-domain loops in HIV proteins. This correlation suggests that RNA structure modulates ribosome elongation to promote native protein folding. Some simple genome elements previously shown to be important, including the ribosomal gag-pol frameshift stem-loop, are components of larger RNA motifs. We also identify organizational principles for unstructured RNA regions, including splice site acceptors and hypervariable regions. These results emphasize that the HIV-1 genome and, potentially, many coding RNAs are punctuated by previously unrecognized regulatory motifs and that extensive RNA structure constitutes an important component of the genetic code.


Figure 1: Organization, extent of RNA structure, and relationship to protein structure for an HIV-1 genome.
Figure 2: Structure of the HIV-1 NL4-3 genome.
Figure 3: SHAPE analysis of the signal peptide–gp120 region.
Figure 4: RNA structure in Env hypervariable regions.




This project was supported by the US National Institutes of Health (AI068462 to K.M.W.) and by the National Cancer Institute, under contracts N01-CO-12400 and HHSN261200800001E (to R.J.G. and J.W.B.). J.M.W. was supported as a Fellow of the UNC Lineberger Cancer Center and a National Institutes of Health (NIH) Kirschstein Postdoctoral Fellowship. R.S. and K.K.D. were supported by NIH grants AI44667 and T32 AI07419, respectively. We are indebted to D. Mathews and J. Low for assistance with the RNAstructure program and genome secondary structure analysis, respectively. The content of this publication does not necessarily reflect the views or policies of the US Department of Health and Human Services, nor does mention of trade names, commercial products, or organizations indicate endorsement by the US Government.

Author Contributions J.M.W., R.J.G. and K.M.W. conceived of and designed the HIV-1 genome structure analysis project. J.M.W. and K.M.W. analysed and interpreted the HIV SHAPE structure information. K.K.D., R.S. and C.L.B. designed and performed the bioinformatic pairing probability analysis. J.M.W., R.J.G. and C.W.L. performed the experiments. J.M.W., C.L.B. and K.M.W. performed the statistical analyses. J.W.B. produced and purified HIV-1 virions. J.M.W. and K.M.W. wrote the manuscript with contributions from all authors.

Author information

Correspondence to Kevin M. Weeks.

Supplementary information

Supplementary Information

This file contains Supplementary Figures S1-S7 with Legends, Supplementary Tables S1-S2 and Supplementary References. (PDF 686 kb)

Supplementary Data

This file contains dataset S1, which is a helix file for the complete NL4-3 RNA genome structure model in tab-delimited ASCII format. Columns are: first nucleotide in helix, last nucleotide in helix, number of base pairs in helix. (TXT 5 kb)
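A helix file with these three columns can be expanded into individual base pairs. The sketch below is a minimal, unofficial reader, not the authors' software. It assumes that each helix is an uninterrupted run of stacked pairs in which the first nucleotide pairs with the last and successive pairs step inward, and that blank or non-numeric lines (for example a header) can be skipped. The names `parse_helix_lines` and `helices_to_pairs` are hypothetical, and the example rows are invented values, not data from the file.

```python
from typing import Iterable, List, Tuple

# (first nucleotide, last nucleotide, number of base pairs)
Helix = Tuple[int, int, int]


def parse_helix_lines(lines: Iterable[str]) -> List[Helix]:
    """Read tab-delimited helix records, skipping blank or non-numeric lines."""
    helices: List[Helix] = []
    for line in lines:
        fields = line.strip().split("\t")
        if len(fields) < 3:
            continue  # blank or malformed line
        try:
            first, last, length = (int(f) for f in fields[:3])
        except ValueError:
            continue  # assumed to be a header or comment line
        helices.append((first, last, length))
    return helices


def helices_to_pairs(helices: Iterable[Helix]) -> List[Tuple[int, int]]:
    """Expand each helix into its base pairs.

    Assumption: pair k of a helix is (first + k, last - k).
    """
    pairs: List[Tuple[int, int]] = []
    for first, last, length in helices:
        for k in range(length):
            pairs.append((first + k, last - k))
    return pairs


# Invented example rows, for illustration only.
example_rows = ["first\tlast\tlength", "1\t20\t3", "", "30\t45\t2"]
example_pairs = helices_to_pairs(parse_helix_lines(example_rows))
```

To read the real file, the same functions could be applied to an open file handle, since iterating over a text file yields one line at a time.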

Supplementary Data

This file contains dataset S2, which shows all SHAPE reactivities and pairing probabilities for the NL4-3 HIV-1 RNA genome in tab-delimited ASCII format. Columns are: nucleotide position, nucleotide identity, SHAPE reactivity, pairing probability. (TXT 213 kb)


About this article

Cite this article

Watts, J., Dang, K., Gorelick, R. et al. Architecture and secondary structure of an entire HIV-1 RNA genome. Nature 460, 711–716 (2009) doi:10.1038/nature08237
