Many DNA-binding factors, such as transcription factors, form oligomeric complexes with structural symmetry that bind to palindromic DNA sequences [1]. Palindromic consensus nucleotide sequences are also found at the genomic integration sites of retroviruses [2–6] and other transposable elements [7–9], and it has been suggested that this palindromic consensus arises as a consequence of the structural symmetry in the integrase complex [2,3]. However, we show here that the palindromic consensus sequence is not present in individual integration sites of human T-cell lymphotropic virus type 1 (HTLV-1) and human immunodeficiency virus type 1 (HIV-1), but arises in the population average as a consequence of the existence of a non-palindromic nucleotide motif that occurs in approximately equal proportions on the plus strand and the minus strand of the host genome. We develop a generally applicable algorithm to sort the individual integration site sequences into plus-strand and minus-strand subpopulations, and use this to identify the integration site nucleotide motifs of five retroviruses of different genera: HTLV-1, HIV-1, murine leukaemia virus (MLV), avian sarcoma leucosis virus (ASLV) and prototype foamy virus (PFV). The results reveal a non-palindromic motif that is shared between these retroviruses.
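The idea of sorting sites by strand can be illustrated with a small simulation. The following is a minimal sketch only, assuming a two-component mixture (motif in plus orientation versus its reverse complement) fitted by expectation-maximization; it is not the authors' published algorithm. The toy consensus `AACCGA`, the match probability, and all function names and parameters are hypothetical choices made for illustration.

```python
import math
import random

BASES = "ACGT"
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return "".join(COMPLEMENT[b] for b in reversed(seq))


def sample_sites(consensus, n, p_match=0.85, seed=0):
    """Simulate n sites from a toy motif; about half are flipped to the minus strand."""
    rng = random.Random(seed)
    sites, flipped = [], []
    for _ in range(n):
        seq = "".join(
            b if rng.random() < p_match else rng.choice([x for x in BASES if x != b])
            for b in consensus
        )
        flip = rng.random() < 0.5
        sites.append(reverse_complement(seq) if flip else seq)
        flipped.append(flip)
    return sites, flipped


def log_lik(seq, pwm):
    """Log-likelihood of seq under a position weight matrix (list of dicts)."""
    return sum(math.log(pwm[j][b]) for j, b in enumerate(seq))


def fit_strand_mixture(sites, n_iter=50, pseudo=0.5):
    """EM for a two-component mixture: motif as read, or as its reverse complement."""
    length = len(sites[0])
    # Break the plus/minus symmetry by seeding the motif from the first site.
    pwm = [{b: (0.7 if b == sites[0][j] else 0.1) for b in BASES} for j in range(length)]
    mix = 0.5  # proportion of sites assigned to the "plus" orientation
    rcs = [reverse_complement(s) for s in sites]
    resp = [0.5] * len(sites)
    for _ in range(n_iter):
        # E-step: posterior probability that each site is in the plus orientation.
        for i, (s, rc) in enumerate(zip(sites, rcs)):
            lp = math.log(mix) + log_lik(s, pwm)
            lm = math.log(1.0 - mix) + log_lik(rc, pwm)
            top = max(lp, lm)
            resp[i] = math.exp(lp - top) / (math.exp(lp - top) + math.exp(lm - top))
        # M-step: re-estimate the motif from orientation-weighted base counts.
        counts = [{b: pseudo for b in BASES} for _ in range(length)]
        for s, rc, r in zip(sites, rcs, resp):
            for j in range(length):
                counts[j][s[j]] += r
                counts[j][rc[j]] += 1.0 - r
        pwm = [{b: c[b] / sum(c.values()) for b in BASES} for c in counts]
        mix = min(max(sum(resp) / len(resp), 1e-6), 1.0 - 1e-6)
    return pwm, mix, resp


def consensus_of(pwm):
    """Most probable base at each motif position."""
    return "".join(max(col, key=col.get) for col in pwm)


sites, flipped = sample_sites("AACCGA", n=400)
pwm, mix, resp = fit_strand_mixture(sites)
called_plus = [r > 0.5 for r in resp]
agree = sum(c != f for c, f in zip(called_plus, flipped)) / len(sites)
accuracy = max(agree, 1.0 - agree)  # which orientation is called "plus" is arbitrary
print(consensus_of(pwm), round(mix, 2), round(accuracy, 2))
```

Because the two orientations are interchangeable, the fitted motif may come out as either the simulated consensus or its reverse complement; the sketch therefore scores the sorting up to a swap of the two labels. Seeding the motif from a single site is one simple way to avoid the symmetric starting point, at which both orientations are equally likely for every site and EM cannot separate them.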
Pabo, C. O. & Sauer, R. T. Protein–DNA recognition. Annu. Rev. Biochem. 53, 293–321 (1984).
Wu, X., Li, Y., Crise, B., Burgess, S. M. & Munroe, D. J. Weak palindromic consensus sequences are a common feature found at the integration target sites of many retroviruses. J. Virol. 79, 5211–5214 (2005).
Holman, A. G. & Coffin, J. M. Symmetrical base preferences surrounding HIV-1, avian sarcoma/leukosis virus, and murine leukemia virus integration sites. Proc. Natl Acad. Sci. USA 102, 6103–6107 (2005).
Grandgenett, D. P. Symmetrical recognition of cellular DNA target sequences during retroviral integration. Proc. Natl Acad. Sci. USA 102, 5903–5904 (2005).
Nowrouzi, A. et al. Genome-wide mapping of foamy virus vector integrations into a human cell line. J. Gen. Virol. 87, 1339–1347 (2006).
Meekings, K. N., Leipzig, J., Bushman, F. D., Taylor, G. P. & Bangham, C. R. M. HTLV-1 integration into transcriptionally active genomic regions is associated with proviral expression and with HAM/TSP. PLoS Pathog. 4, e1000027 (2008).
Liao, G.-c., Rehm, E. J. & Rubin, G. M. Insertion site preferences of the P transposable element in Drosophila melanogaster. Proc. Natl Acad. Sci. USA 97, 3347–3351 (2000).
Gangadharan, S., Mularoni, L., Fain-Thornton, J., Wheelan, S. J. & Craig, N. L. DNA transposon Hermes inserts into DNA in nucleosome-free regions in vivo. Proc. Natl Acad. Sci. USA 107, 21966–21972 (2010).
Chatterjee, A. G. et al. Serial number tagging reveals a prominent sequence preference of retrotransposon integration. Nucleic Acids Res. 42, 8449–8460 (2014).
Lesbats, P., Engelman, A. N. & Cherepanov, P. Retroviral DNA integration. Chem. Rev. 116, 12730–12757 (2016).
Schröder, A. R. et al. HIV-1 integration in the human genome favors active genes and local hotspots. Cell 110, 521–529 (2002).
Wu, X., Li, Y., Crise, B. & Burgess, S. M. Transcription start regions in the human genome are favored targets for MLV integration. Science 300, 1749–1751 (2003).
Mitchell, R. S. et al. Retroviral DNA integration: ASLV, HIV, and MLV show distinct target site preferences. PLoS Biol. 2, e234 (2004).
Narezkina, A. et al. Genome-wide analyses of avian sarcoma virus integration sites. J. Virol. 78, 11656–11663 (2004).
Melamed, A. et al. Genome-wide determinants of proviral targeting, clonal abundance and expression in natural HTLV-1 infection. PLoS Pathog. 9, e1003271 (2013).
Cherepanov, P. et al. HIV-1 integrase forms stable tetramers and associates with LEDGF/p75 protein in human cells. J. Biol. Chem. 278, 372–381 (2003).
Maertens, G. et al. LEDGF/p75 is essential for nuclear and chromosomal targeting of HIV-1 integrase in human cells. J. Biol. Chem. 278, 33528–33539 (2003).
Shun, M.-C. et al. LEDGF/p75 functions downstream from preintegration complex formation to effect gene-specific HIV-1 integration. Genes Dev. 21, 1767–1778 (2007).
Derse, D. et al. Human T-cell leukemia virus type 1 integration target sites in the human genome: comparison with those of other retroviruses. J. Virol. 81, 6731–6741 (2007).
Berry, C., Hannenhalli, S., Leipzig, J. & Bushman, F. D. Selection of target sites for mobile DNA integration in the human genome. PLoS Comput. Biol. 2, e157 (2006).
Carteau, S., Hoffmann, C. & Bushman, F. Chromosome structure and human immunodeficiency virus type 1 cDNA integration: centromeric alphoid repeats are a disfavored target. J. Virol. 72, 4005–4014 (1998).
Stevens, S. W. & Griffith, J. D. Sequence analysis of the human DNA flanking sites of human immunodeficiency virus type 1 integration. J. Virol. 70, 6459–6462 (1996).
Wang, G. P., Ciuffi, A., Leipzig, J., Berry, C. C. & Bushman, F. D. HIV integration site selection: analysis by massively parallel pyrosequencing reveals association with epigenetic modifications. Genome Res. 17, 1186–1194 (2007).
Kass, R. E. & Raftery, A. E. Bayes factors. J. Am. Stat. Assoc. 90, 773–795 (1995).
Maskell, D. P. et al. Structural basis for retroviral integration into nucleosomes. Nature 523, 366–369 (2015).
de Jong, J. et al. Chromatin landscapes of retroviral and transposon integration profiles. PLoS Genet. 10, e1004250 (2014).
Pryciak, P. M. & Varmus, H. E. Nucleosomes, DNA-binding proteins, and DNA sequence modulate retroviral integration target site selection. Cell 69, 769–780 (1992).
Müller, H. P. & Varmus, H. E. DNA bending creates favored sites for retroviral integration: an explanation for preferred insertion sites in nucleosomes. EMBO J. 13, 4704–4714 (1994).
Serrao, E., Ballandras-Colas, A., Cherepanov, P., Maertens, G. N. & Engelman, A. N. Key determinants of target DNA recognition by retroviral intasomes. Retrovirology 12, 39 (2015).
Maertens, G. N., Hare, S. & Cherepanov, P. The mechanism of retroviral integration from X-ray structures of its key intermediates. Nature 468, 326–329 (2010).
Tachiwana, H. et al. Structural basis of instability of the nucleosome containing a testis-specific histone variant, human H3T. Proc. Natl Acad. Sci. USA 107, 10454–10459 (2010).
Benleulmi, M. S. et al. Intasome architecture and chromatin density modulate retroviral integration into nucleosome. Retrovirology 12, 13 (2015).
Serrao, E. et al. Integrase residues that determine nucleotide preferences at sites of HIV-1 integration: implications for the mechanism of target DNA binding. Nucleic Acids Res. 42, 5164–5176 (2014).
Yin, Z. et al. Crystal structure of the Rous sarcoma virus intasome. Nature 530, 362–366 (2016).
Miyoshi, I. et al. A novel T-cell line derived from adult T-cell leukemia. Gan 71, 155–156 (1980).
Gillet, N. A. et al. The host genomic environment of the provirus determines the abundance of HTLV-1-infected T-cell clones. Blood 117, 3113–3122 (2011).
Jackman, S. pscl: Classes and Methods for R Developed in the Political Science Computational Laboratory (Stanford Univ., 2015).
R Core Team. R: A Language and Environment for Statistical Computing (R Foundation for Statistical Computing, 2014); http://www.R-project.org/
Kuncheva, L. A stability index for feature selection. In Proceedings of the 25th International Multi-Conference on Artificial Intelligence and Applications 390–395 (2007).
Dempster, A. P., Laird, N. M. & Rubin, D. B. Maximum likelihood from incomplete data via the EM algorithm. J. Roy. Stat. Soc. B Met. 39, 1–38 (1977).
Aitkin, M. & Rubin, D. B. Estimation and hypothesis testing in finite mixture models. J. Roy. Stat. Soc. B Met. 47, 67–75 (1985).
McLachlan, G. J. On bootstrapping the likelihood ratio test stastistic for the number of components in a normal mixture. J. Roy. Stat. Soc. C Appl. Stat. 36, 318–324 (1987).
This work was supported by the Wellcome Trust UK (Senior Investigator Award 100291 to C.R.M.B.; Investigator Award 107005 to G.N.M.) and the MRC (project reference MC_UP_0801/1). The authors thank the following individuals for providing materials: A. Zhyvoloup and A. Fassati (Division of Infection and Immunity, University College London) and H. Niederer (Division of Infectious Diseases, Imperial College London). The authors also thank L. Game and M. Dore at the Medical Research Council Clinical Sciences Centre Genomics Laboratory at Hammersmith Hospital, London, UK.
The authors declare no competing financial interests.
About this article
Cite this article
Kirk, P., Huvet, M., Melamed, A. et al. Retroviruses integrate into a shared, non-palindromic DNA motif. Nat Microbiol 2, 16212 (2017). https://doi.org/10.1038/nmicrobiol.2016.212