Hepatitis B virus (HBV) is a major cause of human hepatitis. There is considerable uncertainty about the timescale of its evolution and its association with humans. Here we present 12 full or partial ancient HBV genomes that are between approximately 0.8 and 4.5 thousand years old. The ancient sequences group either within or in a sister relationship with extant human or other ape HBV clades. Generally, the genome properties follow those of modern HBV. The root of the HBV tree is projected to between 8.6 and 20.9 thousand years ago, and we estimate a substitution rate of 8.04 × 10^−6 to 1.51 × 10^−5 nucleotide substitutions per site per year. In several cases, the geographical locations of the ancient genotypes do not match present-day distributions. Genotypes that today are typical of Africa and Asia, and a subgenotype from India, are shown to have an early Eurasian presence. The geographical and temporal patterns that we observe in ancient and modern HBV genotypes are compatible with well-documented human migrations during the Bronze and Iron Ages (refs 1, 2). We provide evidence for the creation of HBV genotype A via recombination, and for a long-term association of modern HBV genotypes with humans, including the discovery of a human genotype that is now extinct. These data expose a complexity of HBV evolution that is not evident when considering modern sequences alone.

Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


  1. Allentoft, M. E. et al. Population genomics of Bronze Age Eurasia. Nature 522, 167–172 (2015).
  2. Damgaard, P. d. B. et al. 137 ancient human genomes from across the Eurasian steppes. Nature https://doi.org/10.1038/s41586-018-0094-2 (2018).
  3. Lai, C. L., Ratziu, V., Yuen, M.-F. & Poynard, T. Viral hepatitis B. Lancet 362, 2089–2094 (2003).
  4. Schweitzer, A., Horn, J., Mikolajczyk, R. T., Krause, G. & Ott, J. J. Estimations of worldwide prevalence of chronic hepatitis B virus infection: a systematic review of data published between 1965 and 2013. Lancet 386, 1546–1555 (2015).
  5. Murhekar, M. V., Murhekar, K. M. & Sehgal, S. C. Epidemiology of hepatitis B virus infection among the tribes of Andaman and Nicobar Islands, India. Trans. R. Soc. Trop. Med. Hyg. 102, 729–734 (2008).
  6. Locarnini, S., Littlejohn, M., Aziz, M. N. & Yuen, L. Possible origins and evolution of the hepatitis B virus (HBV). Semin. Cancer Biol. 23, 561–575 (2013).
  7. Littlejohn, M., Locarnini, S. & Yuen, L. Origins and evolution of hepatitis B virus and hepatitis D virus. Cold Spring Harb. Perspect. Med. 6, a021360 (2016).
  8. Kramvis, A. Genotypes and genetic variability of hepatitis B virus. Intervirology 57, 141–150 (2014).
  9. Hannoun, C., Horal, P. & Lindh, M. Long-term mutation rates in the hepatitis B virus genome. J. Gen. Virol. 81, 75–83 (2000).
  10. Zhou, Y. & Holmes, E. C. Bayesian estimates of the evolutionary rate and age of hepatitis B virus. J. Mol. Evol. 65, 197–205 (2007).
  11. Paraskevis, D. et al. Dating the origin of hepatitis B virus reveals higher substitution rate and adaptation on the branch leading to F/H genotypes. Mol. Phylogenet. Evol. 93, 44–54 (2015).
  12. Zehender, G. et al. Enigmatic origin of hepatitis B virus: an ancient travelling companion or a recent encounter? World J. Gastroenterol. 20, 7622–7634 (2014).
  13. Kramvis, A. et al. Relationship of serological subtype, basic core promoter and precore mutations to genotypes/subgenotypes of hepatitis B virus. J. Med. Virol. 80, 27–46 (2008).
  14. MacDonald, D. M., Holmes, E. C., Lewis, J. C. & Simmonds, P. Detection of hepatitis B virus infection in wild-born chimpanzees (Pan troglodytes verus): phylogenetic relationships with human and other primate genotypes. J. Virol. 74, 4253–4257 (2000).
  15. Nielsen, R. et al. Tracing the peopling of the world through genomics. Nature 541, 302–310 (2017).
  16. Rasmussen, S. et al. Early divergent strains of Yersinia pestis in Eurasia 5,000 years ago. Cell 163, 571–582 (2015).
  17. Feldman, M. et al. A high-coverage Yersinia pestis genome from a sixth-century Justinianic plague victim. Mol. Biol. Evol. 33, 2911–2923 (2016).
  18. Reid, A. H., Fanning, T. G., Hultin, J. V. & Taubenberger, J. K. Origin and evolution of the 1918 “Spanish” influenza virus hemagglutinin gene. Proc. Natl Acad. Sci. USA 96, 1651–1656 (1999).
  19. Duggan, A. T. et al. 17th century variola virus reveals the recent history of smallpox. Curr. Biol. 26, 3407–3412 (2016).
  20. Kahila Bar-Gal, G. et al. Tracing hepatitis B virus to the 16th century in a Korean mummy. Hepatology 56, 1671–1680 (2012).
  21. Patterson Ross, Z. et al. The paradox of HBV evolution as revealed from a 16th century mummy. PLoS Pathog. 14, e1006750 (2018).
  22. Bond, W. W. et al. Survival of hepatitis B virus after drying and storage for one week. Lancet 317, 550–551 (1981).
  23. Rasmussen, M. et al. Ancient human genome sequence of an extinct Palaeo-Eskimo. Nature 463, 757–762 (2010).
  24. Simmonds, P. & Midgley, S. Recombination in the genesis and evolution of hepatitis B virus genotypes. J. Virol. 79, 15467–15476 (2005).
  25. Bouckaert, R. et al. BEAST 2: a software platform for Bayesian evolutionary analysis. PLOS Comput. Biol. 10, e1003537 (2014).
  26. Simmonds, P. Reconstructing the origins of human hepatitis viruses. Phil. Trans. R. Soc. Lond. B 356, 1013–1026 (2001).
  27. Tedder, R. S., Bissett, S. L., Myers, R. & Ijaz, S. The ‘Red Queen’ dilemma—running to stay in the same place: reflections on the evolutionary vector of HBV in humans. Antivir. Ther. 18, 489–496 (2013).
  28. Duchêne, S., Holmes, E. C. & Ho, S. Y. W. Analyses of evolutionary dynamics in viruses are hindered by a time-dependent bias in rate estimates. Proc. R. Soc. Lond. B 281, 20140732 (2014).
  29. Zehender, G. et al. Reliable timescale inference of HBV genotype A origin and phylodynamics. Infect. Genet. Evol. 32, 361–369 (2015).
  30. Hannoun, C., Söderström, A., Norkrans, G. & Lindh, M. Phylogeny of African complete genomes reveals a West African genotype A subtype of hepatitis B virus and relatedness between Somali and Asian A1 sequences. J. Gen. Virol. 86, 2163–2167 (2005).
  31. Pickrell, J. K. et al. Ancient west Eurasian ancestry in southern and eastern Africa. Proc. Natl Acad. Sci. USA 111, 2632–2637 (2014).
  32. Ghosh, S. et al. Unique hepatitis B virus subgenotype in a primitive tribal community in eastern India. J. Clin. Microbiol. 48, 4063–4071 (2010).
  33. Basu, A., Sarkar-Roy, N. & Majumder, P. P. Genomic reconstruction of the history of extant populations of India reveals five distinct ancestral components and a complex structure. Proc. Natl Acad. Sci. USA 113, 1594–1599 (2016).
  34. Drexler, J. F. et al. Bats carry pathogenic hepadnaviruses antigenically related to hepatitis B virus and capable of infecting human hepatocytes. Proc. Natl Acad. Sci. USA 110, 16151–16156 (2013).
  35. Geer, L. Y. et al. The NCBI BioSystems database. Nucleic Acids Res. 38, D492–D496 (2010).
  36. Bell, T. G., Yousif, M. & Kramvis, A. Bioinformatic curation and alignment of genotyped hepatitis B virus (HBV) sequence data from the GenBank public database. Springerplus 5, 1896 (2016).
  37. Bronk Ramsey, C. Bayesian analysis of radiocarbon dates. Radiocarbon 51, 337–360 (2009).
  38. Reimer, P. J. et al. IntCal13 and Marine13 radiocarbon age calibration curves 0–50,000 years cal bp. Radiocarbon 55, 1869–1887 (2013).
  39. Lindgreen, S. AdapterRemoval: easy cleaning of next-generation sequencing reads. BMC Res. Notes 5, 337 (2012).
  40. Li, H. & Durbin, R. Fast and accurate short read alignment with Burrows–Wheeler transform. Bioinformatics 25, 1754–1760 (2009).
  41. Buchfink, B., Xie, C. & Huson, D. H. Fast and sensitive protein alignment using DIAMOND. Nat. Methods 12, 59–60 (2015).
  42. Camacho, C. et al. BLAST+: architecture and applications. BMC Bioinformatics 10, 421 (2009).
  43. Drosten, C., Weber, M., Seifried, E. & Roth, W. K. Evaluation of a new PCR assay with competitive internal control sequence for blood donor screening. Transfusion 40, 718–724 (2000).
  44. Willerslev, E. & Cooper, A. Review Paper. Ancient DNA. Proc. R. Soc. Lond. B 272, 3–16 (2005).
  45. Jónsson, H., Ginolhac, A., Schubert, M., Johnson, P. L. F. & Orlando, L. mapDamage2.0: fast approximate Bayesian estimates of ancient DNA damage parameters. Bioinformatics 29, 1682–1684 (2013).
  46. Orlando, L., Gilbert, M. T. P. & Willerslev, E. Reconstructing ancient genomes and epigenomes. Nat. Rev. Genet. 16, 395–408 (2015).
  47. Briggs, A. W. et al. Removal of deaminated cytosines and detection of in vivo methylation in ancient DNA. Nucleic Acids Res. 38, e87 (2010).
  48. Kearse, M. et al. Geneious Basic: an integrated and extendable desktop software platform for the organization and analysis of sequence data. Bioinformatics 28, 1647–1649 (2012).
  49. Needleman, S. B. & Wunsch, C. D. A general method applicable to the search for similarities in the amino acid sequence of two proteins. J. Mol. Biol. 48, 443–453 (1970).
  50. Rice, P., Longden, I. & Bleasby, A. EMBOSS: the European Molecular Biology Open Software Suite. Trends Genet. 16, 276–277 (2000).
  51. Martin, D. P., Murrell, B., Golden, M., Khoosal, A. & Muhire, B. RDP4: detection and analysis of recombination patterns in virus genomes. Virus Evol. 1, vev003 (2015).
  52. Martin, D. & Rybicki, E. RDP: detection of recombination amongst aligned sequences. Bioinformatics 16, 562–563 (2000).
  53. Padidam, M., Sawyer, S. & Fauquet, C. M. Possible emergence of new geminiviruses by frequent recombination. Virology 265, 218–225 (1999).
  54. Martin, D. P., Posada, D., Crandall, K. A. & Williamson, C. A modified bootscan algorithm for automated identification of recombinant sequences and recombination breakpoints. AIDS Res. Hum. Retroviruses 21, 98–102 (2005).
  55. Smith, J. M. Analyzing the mosaic structure of genes. J. Mol. Evol. 34, 126–129 (1992).
  56. Posada, D. & Crandall, K. A. Evaluation of methods for detecting recombination from DNA sequences: computer simulations. Proc. Natl Acad. Sci. USA 98, 13757–13762 (2001).
  57. Gibbs, M. J., Armstrong, J. S. & Gibbs, A. J. Sister-scanning: a Monte Carlo procedure for assessing signals in recombinant sequences. Bioinformatics 16, 573–582 (2000).
  58. Boni, M. F., Posada, D. & Feldman, M. W. An exact nonparametric method for inferring mosaic structure in sequence triplets. Genetics 176, 1035–1047 (2007).
  59. Katoh, K. & Standley, D. M. MAFFT multiple sequence alignment software version 7: improvements in performance and usability. Mol. Biol. Evol. 30, 772–780 (2013).
  60. Guindon, S. et al. New algorithms and methods to estimate maximum-likelihood phylogenies: assessing the performance of PhyML 3.0. Syst. Biol. 59, 307–321 (2010).
  61. Ronquist, F. & Huelsenbeck, J. P. MrBayes 3: Bayesian phylogenetic inference under mixed models. Bioinformatics 19, 1572–1574 (2003).
  62. Rambaut, A., Lam, T. T., Max Carvalho, L. & Pybus, O. G. Exploring the temporal structure of heterochronous sequences using TempEst (formerly Path-O-Gen). Virus Evol. 2, vew007 (2016).
  63. Bouckaert, R. R. & Drummond, A. J. bModelTest: Bayesian phylogenetic site model averaging and model comparison. BMC Evol. Biol. 17, 42 (2017).
  64. Duchêne, S., Duchêne, D., Holmes, E. C. & Ho, S. Y. W. The performance of the date-randomization test in phylogenetic analyses of time-structured virus data. Mol. Biol. Evol. 32, 1895–1906 (2015).
  65. Kass, R. E. & Raftery, A. E. Bayes Factors. J. Am. Stat. Assoc. 90, 773–795 (1995).
  66. Rambaut, A., Suchard, M. A., Xie, D. & Drummond, A. J. Tracer v1.6. https://github.com/beast-dev/tracer/releases/tag/v1.6 (2017).
  67. Sanchez, G. et al. Human (Clovis)–gomphothere (Cuvieronius sp.) association 13,390 calibrated yBP in Sonora, Mexico. Proc. Natl Acad. Sci. USA 111, 10972–10977 (2014).
  68. Bourgeon, L., Burke, A. & Higham, T. Earliest human presence in North America dated to the Last Glacial Maximum: new radiocarbon dates from Bluefish Caves, Canada. PLoS ONE 12, e0169486 (2017).
  69. Andernach, I. E., Nolte, C., Pape, J. W. & Muller, C. P. Slave trade and hepatitis B virus genotypes and subgenotypes in Haiti and Africa. Emerg. Infect. Dis. 15, 1222–1228 (2009).
  70. Kayser, M. et al. Melanesian and Asian origins of Polynesians: mtDNA and Y chromosome gradients across the Pacific. Mol. Biol. Evol. 23, 2234–2244 (2006).


B.B. thanks D. Tserendulam for help, wisdom and guidance. We thank S. Rankin and the staff of the University of Cambridge High Performance Computing service and the National High-throughput Sequencing Centre (Copenhagen). This work was supported by: The Danish National Research Foundation, The Danish National Advanced Technology Foundation (The Genome Denmark platform, grant 019-2011-2), The Villum Kann Rasmussen Foundation, KU2016, European Union FP7 programme ANTIGONE (grant agreement No. 278976), and European Union Horizon 2020 research and innovation programmes, COMPARE (grant agreement No. 643476), VIROGENESIS (grant agreement No. 634650). The National Reference Center for Hepatitis B and D Viruses is supported by the German Ministry of Health via the Robert Koch Institute (Berlin). B.B. was supported by Taylor Family-Asia Foundation Endowed Chair in Ecology and Conservation Biology. A.D.M.E.O. was supported by N-RENNT of the Ministry of Science and Culture of Lower Saxony, Germany.

Reviewer information

Nature thanks P. Simmonds, B. Shapiro, C. Pepperell and the other anonymous reviewer(s) for their contribution to the peer review of this work.

Author information

Author notes

  1. These authors contributed equally: Barbara Mühlemann, Terry C. Jones, Peter de Barros Damgaard, Morten E. Allentoft.


  1. Center for Pathogen Evolution, Department of Zoology, University of Cambridge, Cambridge, UK

    • Barbara Mühlemann, Terry C. Jones & Derek J. Smith
  2. Institute of Virology, Charité, Universitätsmedizin Berlin, Berlin, Germany

    • Terry C. Jones & Christian Drosten
  3. Centre for GeoGenetics, Natural History Museum, University of Copenhagen, Copenhagen, Denmark

    • Peter de Barros Damgaard, Morten E. Allentoft, Anders J. Hansen, Ludovic Orlando, Martin Sikora, Lasse Vinner & Eske Willerslev
  4. Archaeological Laboratory, Faculty of History and Law, A. A. Baitursynov Kostanay State University, Kostanay, Kazakhstan

    • Irina Shevnina & Andrey Logvin
  5. Saryarka Archaeological Institute, Karaganda State University, Karaganda, Kazakhstan

    • Emma Usmanova
  6. Laboratory of Tree-Ring Research, University of Arizona, Tucson, AZ, USA

    • Irina P. Panyushkina
  7. Department of Biology, School of Arts and Sciences, National University of Mongolia, Ulaanbaatar, Mongolia

    • Bazartseren Boldgiv
  8. Laboratory of Virology, Institute of Veterinary Medicine, Mongolian University of Life Sciences, Ulaanbaatar, Mongolia

    • Tsevel Bazartseren
  9. National Academy of Sciences, Bishkek, Kyrgyzstan

    • Kadicha Tashbaeva
  10. Pavlodar State University, Pavlodar, Kazakhstan

    • Victor Merz
  11. Centre for Baltic and Scandinavian Archaeology, Schleswig, Germany

    • Nina Lau
  12. Institute for History of Medicine and Foreign Languages of the First Faculty of Medicine, Charles University, Prague, Czech Republic

    • Václav Smrčka
  13. Margulan Institute of Archaeology, Almaty, Kazakhstan

    • Dmitry Voyakin
  14. N. N. Miklouho-Maklay Institute of Ethnology and Anthropology, Russian Academy of Sciences, Moscow, Russia

    • Egor Kitov
  15. South Ural Department, Institute of History and Archaeology UBRAS, South Ural State University, Chelyabinsk, Russia

    • Andrey Epimakhov
  16. Department of Archaeology and Classical Studies, Stockholm University, Stockholm, Sweden

    • Dalia Pokutta
  17. Matrica Museum, Százhalombatta, Hungary

    • Magdolna Vicze
  18. Department of Historical Studies, University of Gothenburg, Gothenburg, Sweden

    • T. Douglas Price, Karl-Göran Sjögren & Kristian Kristiansen
  19. Department of Physical Anthropology, Peter the Great Museum of Anthropology and Ethnography, Saint-Petersburg, Russia

    • Vyacheslav Moiseyev
  20. Laboratoire d’Anthropobiologie Moléculaire et d’Imagerie de Synthèse, CNRS UMR 5288, Université de Toulouse, Université Paul Sabatier, Toulouse, France

    • Ludovic Orlando
  21. Department of Bio and Health Informatics, Technical University of Denmark, Kongens Lyngby, Denmark

    • Simon Rasmussen
  22. Research Center for Emerging Infections and Zoonoses, University of Veterinary Medicine Hannover, Hannover, Germany

    • Albert D. M. E. Osterhaus
  23. Institute of Medical Virology, Justus Liebig University of Giessen, Giessen, Germany

    • Dieter Glebe
  24. National Reference Centre for Hepatitis B and D Viruses, German Center for Infection Research (DZIF), Giessen, Germany

    • Dieter Glebe
  25. Department of Viroscience, Erasmus Medical Centre, Rotterdam, The Netherlands

    • Ron A. M. Fouchier
  26. German Center for Infection Research (DZIF), Braunschweig, Germany

    • Christian Drosten
  27. Cambridge GeoGenetics Group, Department of Zoology, University of Cambridge, Cambridge, UK

    • Eske Willerslev
  28. Wellcome Trust Sanger Institute, Hinxton, UK

    • Eske Willerslev



All authors contributed to the interpretation of the results. B.M., T.C.J., P.d.B.D., M.E.A., S.R., M.S., L.O., L.V., D.J.S., D.G., R.A.M.F., C.D. and E.W. wrote the paper. B.M. and T.C.J. screened and analysed data, and created display items. P.d.B.D. and M.E.A. conducted sampling and generated sequence data. I.S., A.L., E.U., I.P.P., B.B., T.B., K.T., V.M., N.L., V.S., D.V., E.K., A.E., D.P., M.V., T.D.P. and V.M. excavated, curated, and analysed samples and archaeological contexts. A.J.H. designed virus capture probes. L.V. designed virus capture probes, performed TaqMan PCR and target enrichment experiments. A.D.M.E.O. initiated and provided critical input on the development of next-generation sequencing bioinformatics tools. B.M., T.C.J., D.J.S., D.G., R.A.M.F. and C.D. performed recombination analysis. C.D. analysed data and designed PCR probes. K.-G.S. and K.K. conducted sampling and provided archaeological background. E.W. initiated the work, and led sampling and generation of the sequence data.

Competing interests

The authors declare no competing interests.

Corresponding author

Correspondence to Eske Willerslev.

Extended data figures and tables

  1. Extended Data Fig. 1 aDNA damage patterns.

    The frequencies of the mismatches observed between the HBV reference sequences (Supplementary Table 3) and the reads are shown as a function of distance from the 5′ end. C > T (5′) and G > A (3′) mutations are shown in red and blue, respectively. All other possible mismatches are shown in grey. Insertions are shown in purple, deletions in green and clippings in orange. The count of reads matching HBV for each sample is shown in parentheses. a, Damage patterns for RISE563, DA222, DA119, RISE254, DA195, DA27, DA51, RISE386, RISE387, DA29, DA45, RISE416 and RISE154. b, Damage patterns for DA222 without (left) and with (right) USER treatment. c, Damage patterns with 10, 20, 50, 100, 200, 500 and 1,000 reads sampled from RISE563, in which each opaque line corresponds to one replicate set of reads.

  2. Extended Data Fig. 2 Hepadnaviridae maximum likelihood tree.

    This figure shows 26 Orthohepadnaviridae sequences (dataset 1, see Methods), including the ancient HBV sequences. Ancient genotype A sequences are shown in red, the ancient genotype B sequence in orange, ancient genotype D sequences in blue and novel genotype sequences in green. The tree was constructed in PhyML (ref. 60), optimizing for topology, branch lengths and rates, with 100 bootstraps (see Methods). Internal nodes with < 70% bootstrap support are shown as polytomies.

  3. Extended Data Fig. 3 Genotype A recombination break-point evidence.

    RDP4 (ref. 51) was used to analyse the set of 12 ancient sequences plus a representative set of 15 modern human and non-human primate sequences (see Methods). The seven recombination programs used by RDP4 suggested that all genotype A sequences are recombinants, with the genotype D sequence HBV-DA51 as the minor parent and an unknown major parent. The obvious interpretation is that recombination formed an ancestor of the oldest sequences, evidence of which is still present in the less-ancient and the modern representatives. The figure shows the graphical evidence and predicted recombination break-point distribution for the two oldest genotype A sequences, HBV-RISE386 and HBV-RISE387, according to three of the RDP4 methods (MaxChi, Bootscan and RDP). In all subplots, the predicted location of the break points is shown as a dashed vertical line and the surrounding grey area shows the 99% confidence interval for the break point. Subplots on the same row share their y axis and those in the same column share their x axis. a, HBV-RISE386 analysed by MaxChi. b, HBV-RISE386 analysed by Bootscan. c, HBV-RISE386 analysed by RDP. d, HBV-RISE387 analysed by MaxChi. e, HBV-RISE387 analysed by Bootscan. f, HBV-RISE387 analysed by RDP.

  4. Extended Data Fig. 4 HBV maximum likelihood tree.

    The sequences from dataset 2 (see Methods) and the ancient sequences were aligned in MAFFT (ref. 59). The tree was constructed in PhyML (ref. 60), optimizing for topology, branch lengths and rates, with 100 bootstraps (see Methods). Internal nodes with < 70% bootstrap support are shown as polytomies. Ancient genotype A sequences are shown in red, ancient genotype B sequences in orange, ancient genotype D sequences in blue and novel genotype sequences in green. Taxon names indicate: genotype or subgenotype, GenBank accession number, age, abbreviation of country of sequence origin, region of sequence origin, host species and optional additional remarks. Note that the maximum likelihood tree shows topological uncertainty (polytomies) in areas where the BEAST2 (ref. 25) tree (Fig. 2) is well resolved. This is the case for two reasons. First, BEAST2 always produces a fully resolved binary topology without polytomies. Second, and more important, BEAST2 creates a time tree and uses tip dates to constrain the possible topologies under consideration. Thus, BEAST2 can know that certain topologies are unlikely or impossible, whereas maximum likelihood cannot and thus inherently has greater uncertainty regarding tree topology.

  5. Extended Data Fig. 5 Root-to-tip regression and date randomization tests.

    a, Regression of root-to-tip distances and ages performed in SciPy (http://www.scipy.org). One hundred and twenty-four branch lengths were extracted using TempEst (ref. 62) from trees inferred using neighbour joining, maximum likelihood and Bayesian methods. Shaded areas show 95% confidence intervals. Slopes are 1.01 × 10^−5, 1.20 × 10^−5 and 4.21 × 10^−6, and correlation coefficients are 0.45 (R^2 = 0.2), 0.36 (R^2 = 0.13) and 0.51 (R^2 = 0.26), for maximum likelihood, Bayesian and neighbour joining trees, respectively. b, Date randomization tests under the strict clock model. The median and 95% HPD interval for the substitution rates are given. The rate for the correctly dated tree is shown in red. Dates were randomized within all sequences, within the ancient sequences only, and within each genotype. We performed three replicates of each. None of the 95% HPD intervals for the randomized runs overlaps with the 95% HPD intervals for the correctly dated runs, suggesting the presence of a temporal signal in the data.

  6. Extended Data Table 1 Extended overview of samples with reads matching HBV and PCR results
  7. Extended Data Table 2 Genotype A predicted recombination break points and P values
  8. Extended Data Table 3 Model testing and inferred age of genotypes
  9. Extended Data Table 4 Consensus sequence identity
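
The root-to-tip regression described for Extended Data Fig. 5a can be illustrated with a minimal sketch. It is not the authors' script: the caption states that SciPy was used, whereas this version uses plain Python to stay dependency-free. The function name, the toy sampling years and the toy distances are all hypothetical, and the data are constructed to be perfectly linear, which real root-to-tip data are not. The slope of the fit is a rough rate estimate in substitutions per site per year, and the x-intercept is a rough estimate of the root date.

```python
from math import sqrt


def root_to_tip_regression(sampling_years, distances):
    """Ordinary least-squares fit of root-to-tip distance on sampling year.

    Returns (slope, intercept, r), where slope is in substitutions per site
    per year and r is the Pearson correlation coefficient.
    """
    n = len(sampling_years)
    if n < 2 or n != len(distances):
        raise ValueError("need two or more paired observations")
    mean_x = sum(sampling_years) / n
    mean_y = sum(distances) / n
    sxx = sum((x - mean_x) ** 2 for x in sampling_years)
    syy = sum((y - mean_y) ** 2 for y in distances)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(sampling_years, distances))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / sqrt(sxx * syy)
    return slope, intercept, r


# Hypothetical toy data (NOT from the paper): calendar years of sampling
# (negative values are BCE) and root-to-tip distances generated from an
# assumed rate of 1e-5 substitutions per site per year.
years = [-2500.0, -1500.0, 0.0, 1000.0, 2000.0]
assumed_rate = 1.0e-5
distances = [0.05 + assumed_rate * (y + 2500.0) for y in years]

slope, intercept, r = root_to_tip_regression(years, distances)
root_year = -intercept / slope  # x-intercept: year at which distance is zero

print(f"rate estimate: {slope:.3e} substitutions/site/year")
print(f"correlation coefficient: {r:.3f}")
print(f"implied root year: {root_year:.0f}")
```

With real data the points scatter around the line, which is why the caption reports correlation coefficients well below 1 and why the date randomization tests in panel b are used as a separate check for temporal signal.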

Supplementary information

  1. Supplementary Information

    This file is in PDF format and contains: Three Supplementary Tables: SI Tables 1 and 2 describe the number of reference genomes and accession numbers of sequences used to design capture probes. SI Table 3 contains additional information for the HBV positive samples. A Supplementary Methods section, showing: 1) An investigation into the dependence of damage patterns on the number of reads, 2) Lists of accession numbers for sequences included in the different analyses, and 3) The three phylogenetic trees used for the regression analysis, inferred using neighbour joining, maximum likelihood and Bayesian methods.

  2. Reporting Summary
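
The damage patterns in Extended Data Fig. 1 plot mismatch frequencies as a function of distance from the 5′ end of the reads. The sketch below shows one way such a C > T frequency could be tabulated. It is an illustration only: the reference list points to mapDamage2.0 (ref. 45) for damage estimation, and this toy function does not reproduce that tool. It assumes ungapped, equal-length reference and read strings, and the function name and example alignments are hypothetical.

```python
def c_to_t_frequency(alignments, max_pos=5):
    """Fraction of reference C positions read as T, by distance from the 5' end.

    `alignments` is an iterable of (reference, read) string pairs that are
    already aligned without gaps. Position 0 is the 5' end of the read.
    Positions with no reference C are reported as None.
    """
    c_total = [0] * max_pos
    c_to_t = [0] * max_pos
    for ref, read in alignments:
        if len(ref) != len(read):
            raise ValueError("reference and read must have equal length")
        for pos in range(min(max_pos, len(ref))):
            if ref[pos] == "C":
                c_total[pos] += 1
                if read[pos] == "T":
                    c_to_t[pos] += 1
    return [c_to_t[i] / c_total[i] if c_total[i] else None
            for i in range(max_pos)]


# Hypothetical toy alignments (NOT from the paper): three of the four reads
# show a C > T mismatch at the first base, mimicking terminal deamination.
toy_alignments = [
    ("CACGT", "TACGT"),
    ("CCGTA", "TCGTA"),
    ("CAGCT", "CAGCT"),
    ("CGCAT", "TGCAT"),
]

freqs = c_to_t_frequency(toy_alignments, max_pos=5)
print(freqs)  # → [0.75, 0.0, 0.0, 0.0, None]
```

With only a handful of reads the per-position frequencies are very noisy, which is the motivation for the read-subsampling replicates shown in Extended Data Fig. 1c.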

