The rich fossil record of equids has made them a model for evolutionary processes¹. Here we present a 1.12-times coverage draft genome from a horse bone recovered from permafrost dated to approximately 560–780 thousand years before present (kyr bp)²,³. Our data represent the oldest full genome sequence determined so far by almost an order of magnitude. For comparison, we sequenced the genome of a Late Pleistocene horse (43 kyr bp), and modern genomes of five domestic horse breeds (Equus ferus caballus), a Przewalski’s horse (E. f. przewalskii) and a donkey (E. asinus). Our analyses suggest that the Equus lineage giving rise to all contemporary horses, zebras and donkeys originated 4.0–4.5 million years before present (Myr bp), twice the conventionally accepted time to the most recent common ancestor of the genus Equus⁴,⁵. We also find that horse population size fluctuated multiple times over the past 2 Myr, particularly during periods of severe climatic changes. We estimate that the Przewalski’s and domestic horse populations diverged 38–72 kyr bp, and find no evidence of recent admixture between the domestic horse breeds and the Przewalski’s horse investigated. This supports the contention that Przewalski’s horses represent the last surviving wild horse population⁶. We find similar levels of genetic variation among Przewalski’s and domestic populations, indicating that the former are genetically viable and worthy of conservation efforts. We also find evidence for continuous selection on the immune system and olfaction throughout horse evolution. Finally, we identify 29 genomic regions among horse breeds that deviate from neutrality and show low levels of genetic variation compared to the Przewalski’s horse. Such regions could correspond to loci selected early during domestication.
References
- The Rise of Horses: 55 Million Years of Evolution (Johns Hopkins Univ. Press, 2010)
- Ancient permafrost and a future, warmer Arctic. Science 321, 1648 (2008)
- Gold Run tephra: A Middle Pleistocene stratigraphic and paleoenvironmental marker across west-central Yukon Territory, Canada. Can. J. Earth Sci. 46, 465–478 (2009)
- Origins, dispersals, and migrations of Equus (Mammalia, Perissodactyla). Courier Forschungsinstitut Senckenberg 153, 161–170 (1992)
- Mitochondrial-DNA timetable and the evolution of Equus: Comparison of molecular and paleontological evidence. Ann. Zool. Fenn. 28, 301–309 (1992)
- A massively parallel sequencing approach uncovers ancient origins and high genetic variability of endangered Przewalski’s horses. Genome Biol. Evol. 3, 1096–1106 (2011)
- Response of permafrost to last interglacial warming: field evidence from non-glaciated Yukon and Alaska. Quat. Sci. Rev. 29, 3256–3274 (2010)
- True single-molecule DNA sequencing of a Pleistocene horse bone. Genome Res. 21, 1705–1719 (2011)
- Instability and decay of the primary structure of DNA. Nature 362, 709–715 (1993)
- Ancient biomolecules from deep ice cores reveal a forested southern Greenland. Science 317, 111–114 (2007)
- Polar and brown bear genomes reveal ancient admixture and demographic footprints of past climate change. Proc. Natl Acad. Sci. USA 109, E2382–E2390 (2012)
- Proteomic analysis of a Pleistocene mammoth femur reveals more than one hundred ancient bone proteins. J. Proteome Res. 11, 917–926 (2012)
- Improving the performance of True Single Molecule Sequencing for ancient DNA. BMC Genomics 13, 177 (2012)
- Ancient human genome sequence of an extinct Palaeo-Eskimo. Nature 463, 757–762 (2010)
- A high-coverage genome sequence from an archaic Denisovan individual. Science 338, 222–226 (2012)
- Site-specific deamidation of glutamine: a new marker of bone collagen deterioration. Rapid Commun. Mass Spectrom. 26, 2319–2327 (2012)
- Mitochondrial phylogenomics of modern and ancient equids. PLoS ONE 8, e55950 (2013)
- Cranium of Dinohippus mexicanus (Mammalia Equidae) from the early Pliocene (latest Hemphillian) of central Mexico and the origin of Equus. Bull. Florida Museum Nat. History 43, 163–185 (2002)
- Evolution, systematics, and phylogeography of Pleistocene horses in the new world: a molecular perspective. PLoS Biol. 3, e241 (2005)
- A draft sequence of the Neandertal genome. Science 328, 710–722 (2010)
- Inference of human population history from individual whole-genome sequences. Nature 475, 493–496 (2011)
- Species-specific responses of Late Quaternary megafauna to climate and humans. Nature 479, 359–364 (2011)
- International Union for Conservation of Nature. IUCN Red List of Threatened Species, Version 2010.1, http://www.iucnredlist.org (downloaded 11 March 2010)
- Genetic history of an archaic hominin group from Denisova Cave in Siberia. Nature 468, 1053–1060 (2010)
- Genetic variation in Przewalski’s horses, with special focus on the last wild caught mare, 231 Orlitza III. Cytogenet. Genome Res. 102, 226–234 (2003)
- Genome sequence, comparative analysis, and population genetics of the domestic horse. Science 326, 865–867 (2009)
- The half-life of DNA in bone: measuring decay kinetics in 158 dated fossils. Proc. R. Soc. Lond. B 279, 4724–4733 (2012)
- Optimized fast and sensitive acquisition methods for shotgun proteomics on a quadrupole orbitrap mass spectrometer. J. Proteome Res. 11, 3487–3497 (2012)
- Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics 25, 1754–1760 (2009)
- Revising the recent evolutionary history of equids using ancient DNA. Proc. Natl Acad. Sci. USA 106, 21754–21759 (2009)
- Ancient DNA extraction from bones and teeth. Nature Protocols 2, 1756–1762 (2007)
- Illumina sequencing library preparation for highly multiplexed target capture and sequencing. Cold Spring Harb. Protoc. 6, http://dx.doi.org/10.1101/pdb.prot5448 (2010)
- SOAPdenovo2: an empirically improved memory-efficient short-read de novo assembler. GigaScience 1, 18 (2012)
- AUGUSTUS: a web server for gene finding in eukaryotes. Nucleic Acids Res. 32, W309–W312 (2004)
- Draft genome sequence of the sexually transmitted pathogen Trichomonas vaginalis. Science 315, 207–212 (2007)
- Fast and accurate long-read alignment with Burrows-Wheeler transform. Bioinformatics 26, 589–595 (2010)
- The Sequence alignment/map (SAM) format and SAMtools. Bioinformatics 25, 2078–2079 (2009)
- A high density SNP array for the domestic horse and extant Perissodactyla: utility for association mapping, genetic diversity, and phylogeny studies. PLoS Genet. 8, e1002451 (2012)
- PLINK: a tool set for whole-genome association and population-based linkage analyses. Am. J. Hum. Genet. 81, 559–575 (2007)
- Population structure and eigenanalysis. PLoS Genet. 2, e190 (2006)
- R Development Core Team. A language and environment for statistical computing, http://www.R-project.org (R Foundation for Statistical Computing, 2011)
- Deriving the consequences of genomic variants with the Ensembl API and SNP Effect Predictor. Bioinformatics 26, 2069–2070 (2010)
- The thermal history of human fossils and the likelihood of successful DNA amplification. J. Hum. Evol. 45, 203–217 (2003)
- mapDamage: testing for damage patterns in ancient DNA sequences. Bioinformatics 27, 2153–2155 (2011)
- Patterns of damage in genomic DNA sequences from a Neandertal. Proc. Natl Acad. Sci. USA 104, 14616–14621 (2007)
- MaxQuant enables high peptide identification rates, individualized p.p.b.-range mass accuracies and proteome-wide protein quantification. Nature Biotechnol. 26, 1367–1372 (2008)
- Andromeda: a peptide search engine integrated into the MaxQuant environment. J. Proteome Res. 10, 1794–1805 (2011)
- MAFFT: a novel method for rapid multiple sequence alignment based on fast Fourier transform. Nucleic Acids Res. 30, 3059–3066 (2002)
- Recent developments in the MAFFT multiple sequence alignment program. Brief. Bioinform. 9, 286–298 (2008)
- RAxML-VI-HPC: maximum likelihood-based phylogenetic analyses with thousands of taxa and mixed models. Bioinformatics 22, 2688–2690 (2006)
- RAxML-Light: a tool for computing Terabyte phylogenies. Bioinformatics 28, 2064–2066 (2012)
- r8s: inferring absolute rates of molecular evolution and divergence times in the absence of a molecular clock. Bioinformatics 19, 301–302 (2003)
- CONSEL: for assessing the confidence of phylogenetic tree selection. Bioinformatics 17, 1246–1247 (2001)
- Whole mitochondrial genome sequencing of domestic horses reveals incorporation of extensive wild horse diversity during domestication. BMC Evol. Biol. 11, 328 (2011)
- Mitochondrial genomes from modern horses reveal the major haplogroups that underwent domestication. Proc. Natl Acad. Sci. USA 109, 2449–2454 (2012)
- Reconstructing the origin and spread of horse domestication in the Eurasian steppe. Proc. Natl Acad. Sci. USA 109, 8202–8206 (2012)
- BEAST: Bayesian evolutionary analysis by sampling trees. BMC Evol. Biol. 7, 214 (2007)
- Bayesian phylogenetics with BEAUti and the BEAST 1.7. Mol. Biol. Evol. 29, 1969–1973 (2012)
- Tracer v1.5, http://beast.bio.ed.ac.uk/Tracer (2009)
- Generating samples under a Wright-Fisher neutral model of genetic variation. Bioinformatics 18, 337–338 (2002)
- Computational Molecular Evolution (Oxford Univ. Press, 2006)
- Systematic and integrative analysis of large gene lists using DAVID Bioinformatics Resources. Nature Protocols 4, 44–57 (2009)
- Molecular signatures of natural selection. Annu. Rev. Genet. 39, 197–218 (2005)
- Delete-m Jackknife for Unequal m. Stat. Comput. 9, 3–8 (1999)
- Assessment of methods for amino acid matrix selection and their use on empirical data shows that ad hoc assumptions for choice of matrix are not justified. BMC Evol. Biol. 6, 29 (2006)
- New algorithms and methods to estimate Maximum-Likelihood phylogenies: assessing the performance of PhyML 3.0. Syst. Biol. 59, 307–321 (2010)
- Supplementary Information (22.1 MB)
This file contains Supplementary Text and Data, Supplementary Figures, Supplementary Tables and additional references (see Contents for details). This file was updated on 3 July 2013 to correctly display figure S1.3.
- Supplementary Figures (2.1 MB)
This file contains Supplementary Figures S6.8-S6.38, which show DNA fragmentation and nucleotide misincorporation patterns for mitochondrial reads from other ancient samples analyzed in this study.
- Supplementary Tables (9.9 MB)
This zipped file contains Supplementary Tables 4.2, 4.3, 4.4, 5.9, 11.3, 11.4, 11.7 and 12.8.