Retroviruses integrate into a shared, non-palindromic DNA motif

This article has been updated


Many DNA-binding factors, such as transcription factors, form oligomeric complexes with structural symmetry that bind to palindromic DNA sequences [1]. Palindromic consensus nucleotide sequences are also found at the genomic integration sites of retroviruses [2–6] and other transposable elements [7–9], and it has been suggested that this palindromic consensus arises as a consequence of the structural symmetry in the integrase complex [2,3]. However, we show here that the palindromic consensus sequence is not present in individual integration sites of human T-cell lymphotropic virus type 1 (HTLV-1) and human immunodeficiency virus type 1 (HIV-1), but arises in the population average as a consequence of the existence of a non-palindromic nucleotide motif that occurs in approximately equal proportions on the plus strand and the minus strand of the host genome. We develop a generally applicable algorithm to sort the individual integration site sequences into plus-strand and minus-strand subpopulations, and use this to identify the integration site nucleotide motifs of five retroviruses of different genera: HTLV-1, HIV-1, murine leukaemia virus (MLV), avian sarcoma leucosis virus (ASLV) and prototype foamy virus (PFV). The results reveal a non-palindromic motif that is shared between these retroviruses.
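The central claim, that a palindromic population average can arise from a non-palindromic motif present on both strands, can be illustrated with a small simulation. The following is a minimal sketch and not the authors' analysis: the six-base consensus `AACAGG`, the 0.7/0.1 base probabilities, the sample sizes and all function names are assumptions chosen only for illustration.

```python
import random

BASES = "ACGT"
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return "".join(COMPLEMENT[b] for b in reversed(seq))


def ppm(seqs):
    """Position probability matrix: one {base: frequency} dict per position."""
    n = len(seqs)
    return [{b: sum(s[i] == b for s in seqs) / n for b in BASES}
            for i in range(len(seqs[0]))]


def is_palindromic(matrix, tol=0.05):
    """True if P(base b at i) matches P(complement of b at the mirrored position)."""
    last = len(matrix) - 1
    return all(abs(matrix[i][b] - matrix[last - i][COMPLEMENT[b]]) <= tol
               for i in range(len(matrix)) for b in BASES)


def sample_site(motif, rng):
    """Draw one sequence, treating positions as independent."""
    return "".join(rng.choices(BASES, weights=[col[b] for b in BASES])[0]
                   for col in motif)


# Hypothetical non-palindromic motif (assumption, not taken from the article).
CONSENSUS = "AACAGG"
motif_ppm = [{b: (0.7 if b == c else 0.1) for b in BASES} for c in CONSENSUS]

rng = random.Random(0)
plus_sites = [sample_site(motif_ppm, rng) for _ in range(10000)]
minus_sites = [reverse_complement(sample_site(motif_ppm, rng))
               for _ in range(10000)]

# Pooling equal numbers of plus- and minus-strand sites averages the motif
# with its reverse complement, which is palindromic by construction.
pooled_is_palindromic = is_palindromic(ppm(plus_sites + minus_sites))
plus_is_palindromic = is_palindromic(ppm(plus_sites))
```

In this toy setting the pooled matrix passes the symmetry check while the plus-strand subpopulation alone does not, which mirrors the argument made in the abstract.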

Figure 1: Palindromic HTLV-1 and HIV-1 target integration site consensus sequences and position probability matrices (PPMs), calculated from 4,521 HTLV-1 and 13,442 HIV-1 integration site (InS) sequences.
Figure 2: Distribution of adjusted palindrome index (API) scores.
Figure 3: Summary of results from fitting the two-component mixture model by maximum likelihood.
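Figure 3 refers to a two-component mixture model fitted by maximum likelihood, and the reference list includes the EM algorithm (Dempster et al., reference 40). A minimal sketch of how such a model could sort sequences by strand is given below. It is an illustration under stated assumptions rather than the published method: positions are treated as independent, both components share one motif (the second being its reverse complement), the data are simulated from a hypothetical consensus `AACAGG`, and every function name is invented for this example.

```python
import random

BASES = "ACGT"
COMP = {"A": "T", "C": "G", "G": "C", "T": "A"}


def revcomp(seq):
    """Reverse complement of a DNA string."""
    return "".join(COMP[b] for b in reversed(seq))


def likelihood(seq, motif):
    """P(seq | motif), assuming independent positions."""
    p = 1.0
    for i, b in enumerate(seq):
        p *= motif[i][b]
    return p


def fit_strand_mixture(seqs, n_iter=50, seed=0, pseudo=0.5):
    """EM for a mixture of a motif and its reverse complement."""
    rng = random.Random(seed)
    length = len(seqs[0])
    rc_seqs = [revcomp(s) for s in seqs]
    # Random initial responsibilities break the plus/minus symmetry.
    resp = [rng.random() for _ in seqs]
    for _ in range(n_iter):
        # M-step: weighted base counts; a minus-strand read contributes
        # its reverse complement to the shared motif.
        counts = [{b: pseudo for b in BASES} for _ in range(length)]
        for s, rc, r in zip(seqs, rc_seqs, resp):
            for i in range(length):
                counts[i][s[i]] += r
                counts[i][rc[i]] += 1.0 - r
        motif = [{b: col[b] / sum(col.values()) for b in BASES}
                 for col in counts]
        rho = sum(resp) / len(seqs)  # estimated plus-strand proportion
        # E-step: posterior probability that each sequence is plus-strand.
        resp = []
        for s, rc in zip(seqs, rc_seqs):
            plus = rho * likelihood(s, motif)
            minus = (1.0 - rho) * likelihood(rc, motif)
            resp.append(plus / (plus + minus))
    return motif, rho, resp


# Simulated data (assumption): alternate plus- and minus-strand sites.
TRUE_CONSENSUS = "AACAGG"
true_motif = [{b: (0.7 if b == c else 0.1) for b in BASES}
              for c in TRUE_CONSENSUS]
rng = random.Random(42)
truth = [i % 2 == 0 for i in range(2000)]  # True means plus strand
seqs = []
for is_plus in truth:
    s = "".join(rng.choices(BASES, weights=[col[b] for b in BASES])[0]
                for col in true_motif)
    seqs.append(s if is_plus else revcomp(s))

motif, rho, resp = fit_strand_mixture(seqs)
consensus = "".join(max(BASES, key=col.get) for col in motif)
agree = sum((r > 0.5) == t for r, t in zip(resp, truth)) / len(truth)
accuracy = max(agree, 1.0 - agree)  # strand labels are arbitrary
```

Because the two components are mirror images of each other, the fit is only identifiable up to a swap of the strand labels, so the recovered consensus may be either the motif or its reverse complement. That is why the accuracy above is taken as the larger of the two labelings.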

Change history

  • 14 July 2017

    In the previously published PDF version of this article, the year of publication given in the footer of each page and in the 'How to cite' section was erroneously stated as 2017; it should have been 2016. This error has now been corrected. The HTML version of the article was not affected.


References

  1. Pabo, C. O. & Sauer, R. T. Protein–DNA recognition. Annu. Rev. Biochem. 53, 293–321 (1984).
  2. Wu, X., Li, Y., Crise, B., Burgess, S. M. & Munroe, D. J. Weak palindromic consensus sequences are a common feature found at the integration target sites of many retroviruses. J. Virol. 79, 5211–5214 (2005).
  3. Holman, A. G. & Coffin, J. M. Symmetrical base preferences surrounding HIV-1, avian sarcoma/leukosis virus, and murine leukemia virus integration sites. Proc. Natl Acad. Sci. USA 102, 6103–6107 (2005).
  4. Grandgenett, D. P. Symmetrical recognition of cellular DNA target sequences during retroviral integration. Proc. Natl Acad. Sci. USA 102, 5903–5904 (2005).
  5. Nowrouzi, A. et al. Genome-wide mapping of foamy virus vector integrations into a human cell line. J. Gen. Virol. 87, 1339–1347 (2006).
  6. Meekings, K. N., Leipzig, J., Bushman, F. D., Taylor, G. P. & Bangham, C. R. M. HTLV-1 integration into transcriptionally active genomic regions is associated with proviral expression and with HAM/TSP. PLoS Pathog. 4, e1000027 (2008).
  7. Liao, G.-c., Rehm, E. J. & Rubin, G. M. Insertion site preferences of the P transposable element in Drosophila melanogaster. Proc. Natl Acad. Sci. USA 97, 3347–3351 (2000).
  8. Gangadharan, S., Mularoni, L., Fain-Thornton, J., Wheelan, S. J. & Craig, N. L. DNA transposon Hermes inserts into DNA in nucleosome-free regions in vivo. Proc. Natl Acad. Sci. USA 107, 21966–21972 (2010).
  9. Chatterjee, A. G. et al. Serial number tagging reveals a prominent sequence preference of retrotransposon integration. Nucleic Acids Res. 42, 8449–8460 (2014).
  10. Lesbats, P., Engelman, A. N. & Cherepanov, P. Retroviral DNA integration. Chem. Rev. 116, 12730–12757 (2016).
  11. Schröder, A. R. et al. HIV-1 integration in the human genome favors active genes and local hotspots. Cell 110, 521–529 (2002).
  12. Wu, X., Li, Y., Crise, B. & Burgess, S. M. Transcription start regions in the human genome are favored targets for MLV integration. Science 300, 1749–1751 (2003).
  13. Mitchell, R. S. et al. Retroviral DNA integration: ASLV, HIV, and MLV show distinct target site preferences. PLoS Biol. 2, e234 (2004).
  14. Narezkina, A. et al. Genome-wide analyses of avian sarcoma virus integration sites. J. Virol. 78, 11656–11663 (2004).
  15. Melamed, A. et al. Genome-wide determinants of proviral targeting, clonal abundance and expression in natural HTLV-1 infection. PLoS Pathog. 9, e1003271 (2013).
  16. Cherepanov, P. et al. HIV-1 integrase forms stable tetramers and associates with LEDGF/p75 protein in human cells. J. Biol. Chem. 278, 372–381 (2003).
  17. Maertens, G. et al. LEDGF/p75 is essential for nuclear and chromosomal targeting of HIV-1 integrase in human cells. J. Biol. Chem. 278, 33528–33539 (2003).
  18. Shun, M.-C. et al. LEDGF/p75 functions downstream from preintegration complex formation to effect gene-specific HIV-1 integration. Genes Dev. 21, 1767–1778 (2007).
  19. Derse, D. et al. Human T-cell leukemia virus type 1 integration target sites in the human genome: comparison with those of other retroviruses. J. Virol. 81, 6731–6741 (2007).
  20. Berry, C., Hannenhalli, S., Leipzig, J. & Bushman, F. D. Selection of target sites for mobile DNA integration in the human genome. PLoS Comput. Biol. 2, e157 (2006).
  21. Carteau, S., Hoffmann, C. & Bushman, F. Chromosome structure and human immunodeficiency virus type 1 cDNA integration: centromeric alphoid repeats are a disfavored target. J. Virol. 72, 4005–4014 (1998).
  22. Stevens, S. W. & Griffith, J. D. Sequence analysis of the human DNA flanking sites of human immunodeficiency virus type 1 integration. J. Virol. 70, 6459–6462 (1996).
  23. Wang, G. P., Ciuffi, A., Leipzig, J., Berry, C. C. & Bushman, F. D. HIV integration site selection: analysis by massively parallel pyrosequencing reveals association with epigenetic modifications. Genome Res. 17, 1186–1194 (2007).
  24. Kass, R. E. & Raftery, A. E. Bayes factors. J. Am. Stat. Assoc. 90, 773–795 (1995).
  25. Maskell, D. P. et al. Structural basis for retroviral integration into nucleosomes. Nature 523, 366–369 (2015).
  26. de Jong, J. et al. Chromatin landscapes of retroviral and transposon integration profiles. PLoS Genet. 10, e1004250 (2014).
  27. Pryciak, P. M. & Varmus, H. E. Nucleosomes, DNA-binding proteins, and DNA sequence modulate retroviral integration target site selection. Cell 69, 769–780 (1992).
  28. Müller, H. P. & Varmus, H. E. DNA bending creates favored sites for retroviral integration: an explanation for preferred insertion sites in nucleosomes. EMBO J. 13, 4704–4714 (1994).
  29. Serrao, E., Ballandras-Colas, A., Cherepanov, P., Maertens, G. N. & Engelman, A. N. Key determinants of target DNA recognition by retroviral intasomes. Retrovirology 12, 39 (2015).
  30. Maertens, G. N., Hare, S. & Cherepanov, P. The mechanism of retroviral integration from X-ray structures of its key intermediates. Nature 468, 326–329 (2010).
  31. Tachiwana, H. et al. Structural basis of instability of the nucleosome containing a testis-specific histone variant, human H3T. Proc. Natl Acad. Sci. USA 107, 10454–10459 (2010).
  32. Benleulmi, M. S. et al. Intasome architecture and chromatin density modulate retroviral integration into nucleosome. Retrovirology 12, 13 (2015).
  33. Serrao, E. et al. Integrase residues that determine nucleotide preferences at sites of HIV-1 integration: implications for the mechanism of target DNA binding. Nucleic Acids Res. 42, 5164–5176 (2014).
  34. Yin, Z. et al. Crystal structure of the Rous sarcoma virus intasome. Nature 530, 362–366 (2016).
  35. Miyoshi, I. et al. A novel T-cell line derived from adult T-cell leukemia. Gan 71, 155–156 (1980).
  36. Gillet, N. A. et al. The host genomic environment of the provirus determines the abundance of HTLV-1-infected T-cell clones. Blood 117, 3113–3122 (2011).
  37. Jackman, S. pscl: Classes and Methods for R Developed in the Political Science Computational Laboratory (Stanford Univ., 2015).
  38. R Core Team. R: A Language and Environment for Statistical Computing (R Foundation for Statistical Computing, 2014).
  39. Kuncheva, L. A stability index for feature selection. In Proceedings of the 25th International Multi-Conference on Artificial Intelligence and Applications 390–395 (2007).
  40. Dempster, A. P., Laird, N. M. & Rubin, D. B. Maximum likelihood from incomplete data via the EM algorithm. J. Roy. Stat. Soc. B Met. 39, 1–38 (1977).
  41. Aitkin, M. & Rubin, D. B. Estimation and hypothesis testing in finite mixture models. J. Roy. Stat. Soc. B Met. 47, 67–75 (1985).
  42. McLachlan, G. J. On bootstrapping the likelihood ratio test statistic for the number of components in a normal mixture. J. Roy. Stat. Soc. C Appl. Stat. 36, 318–324 (1987).


Acknowledgements

This work was supported by the Wellcome Trust UK (Senior Investigator Award 100291 to C.R.M.B.; Investigator Award 107005 to G.N.M.) and the MRC (project reference MC_UP_0801/1). The authors thank the following individuals for providing materials: A. Zhyvoloup and A. Fassati (Division of Infection and Immunity, University College London) and H. Niederer (Division of Infectious Diseases, Imperial College London). The authors also thank L. Game and M. Dore at the Medical Research Council Clinical Sciences Centre Genomics Laboratory at Hammersmith Hospital, London, UK.

Author information

Contributions

P.K. and C.B. conceived the project. A.M. and G.M. performed the experiments. P.K. and M.H. performed the statistical analysis and modelling. P.K. and C.B. co-wrote the paper.

Corresponding author

Correspondence to Charles R. M. Bangham.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Supplementary information

Supplementary Figures 1–5, Supplementary References (PDF 305 kb)

Cite this article

Kirk, P., Huvet, M., Melamed, A. et al. Retroviruses integrate into a shared, non-palindromic DNA motif. Nat Microbiol 2, 16212 (2017).
