  • Review Article

Dissecting evolution and disease using comparative vertebrate genomics

Key Points

  • The insight gained through comparative genomics is markedly increasing as novel sequencing technologies are developed and become more affordable. Different technologies, data types and quality can be used to ask biological questions within or across species.

  • Genome annotation, incorporating coding genes, non-coding RNAs, transcription factor-binding sites, chromatin marks and more, is key to understanding genome evolution and disease. Many of these features are shared across species.

  • Genotype–phenotype correlations can be used to understand the genetic basis of numerous traits and diseases. These can be studied at the individual, population or species level.

  • Tools and resources for genome annotation and trait mapping have made it possible to identify disease genes in many domestic animals. This information can also be used to guide the search for human disease-associated genes.

  • Low-pass population sequencing data have been used successfully to find natural genetic adaptations to salinity, high altitude and diet. Mutation types vary from coding to non-coding and from point mutations to structural variants.

  • Analysis of large numbers of mammals, birds or fish will enable the detection of constraint (a sign of function) as well as positive selection across clades. Convergent evolution, in which different mutations in the same genetic loci are responsible for specific phenotypic adaptation, is apparent across many lineages, including the red and giant pandas.

Abstract

With the generation of more than 100 sequenced vertebrate genomes in less than 25 years, the key question arises of how these resources can be used to inform new or ongoing projects. In the past, this diverse collection of sequences from human as well as model and non-model organisms has been used to annotate the human genome and to increase the understanding of human disease. In the future, comparative vertebrate genomics in conjunction with additional genomic resources will yield insights into the processes of genome function, evolution, speciation, selection and adaptation, as well as the quantification of species diversity. In this Review, we discuss how the genomics of non-human organisms can provide insights into vertebrate biology and how this can contribute to the understanding of human physiology and health.

Figure 1: A snapshot of vertebrate genome sequencing projects.
Figure 2: Designing a sequencing project.
Figure 3: A case study in phenotypic adaptation: high altitude and EPAS1.
Figure 4: Comparative genomics for convergent evolution analyses.

Acknowledgements

J.R.S.M. was supported by the Swedish Research Council, FORMAS (221-2012-1531). K.L.-T. was supported by the Swedish Research Council, European Research Council (ERC) Starting Grant and Knut och Alice Wallenberg Foundation.

Author information

Corresponding author

Correspondence to Kerstin Lindblad-Toh.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Glossary

Reference genome

A high-quality species genome onto which other information is projected, such as genes, polymorphisms and elements of gene regulation.

Bacterial artificial chromosome

(BAC). Approximately 200,000 bp of sequence that has been cloned into a bacterial vector and can then be amplified and sequenced.

Whole-genome shotgun sequencing

The genome is shattered into smaller pieces and sequenced, originally with Sanger technology.

Sanger sequencing

An old standard type of sequencing in which the four bases are labelled with four fluorophores of different colours. It results in ~600 bp reads and was the methodology used for the human genome project.

High-quality draft genome assemblies

A form of genome assembly that has both long contigs (stretches of uninterrupted sequence, many kb in length) and supercontigs (structures of sequence hanging together but including smaller gaps) in the Mb range.

Haplotype blocks

Regions of the genome that are inherited together without recombination. These are characterized by high linkage disequilibrium.
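Linkage disequilibrium between two biallelic variants is commonly summarized as r², the squared correlation between their alleles. The following is a minimal sketch, assuming phased haplotypes coded as 0/1 pairs; the function name `ld_r2` and the input layout are illustrative assumptions, not taken from the article.

```python
def ld_r2(haplotypes):
    """Return r^2 between two biallelic sites.

    `haplotypes` is a list of (a, b) pairs, one per phased haplotype,
    where a and b are 0 (reference) or 1 (alternative) at site A and site B.
    This input layout is an assumption made for illustration.
    """
    n = len(haplotypes)
    if n == 0:
        raise ValueError("need at least one haplotype")
    p_a = sum(a for a, _ in haplotypes) / n          # frequency of allele 1 at site A
    p_b = sum(b for _, b in haplotypes) / n          # frequency of allele 1 at site B
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b                             # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both sites must be polymorphic")
    return d * d / denom


# Alleles always travel together: complete LD.
linked = [(1, 1), (1, 1), (0, 0), (0, 0)]
# All four allele combinations equally common: no LD.
unlinked = [(1, 1), (1, 0), (0, 1), (0, 0)]
print(ld_r2(linked), ld_r2(unlinked))
```

Values of r² near 1 across neighbouring variants are what one would expect inside a haplotype block; values near 0 suggest that recombination has separated the sites.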

Vertebrate model species

Vertebrate species that are studied to understand the biology or phenotype in another species.

Non-model organisms

Organisms that are examined to offer insight into themselves rather than principally studied to understand another trait, for example, human health.

Short-read sequencing

(SRS). Short-read technologies, such as Illumina, generate continuous sequence reads of 100–250 bp.

Long-read sequencing

(LRS). Strategies such as PacBio's single-molecule real-time (SMRT) generate continuous sequence length in the order of many kilobases. Over time, these technologies will go down in price and will probably be the methods of choice.

Haplotypes

Versions of a gene or part of a gene, including several variants that are inherited together.

Chromosome interaction mapping

A methodology to analyse the 3D organization of chromatin. Looping can be functional, for example, bringing enhancers into contact with distal promoters.

Single-nucleotide polymorphisms

(SNPs). Positions in the genome at which two or more alleles occur. Biallelic markers are used to look for association of one allele (gene version) with disease.

RNA sequencing

(RNA-seq). The sequencing of all mRNA transcripts from a cell or tissues.

Histone marks

Histones are proteins that package DNA into units. Histone marks indicate where chromatin is open or closed and provide insights into genome regulation; for example, histone 3 lysine 27 acetylation (H3K27ac) is associated with active enhancers.

Adaptation

A trait that has changed to enable a species to function under certain circumstances or in a specific environment.

Domestication

A complex process partly driven by human selection of standing natural variation.

Convergent evolution

The independent evolution of similar features in multiple species of different lineages.

Pooled genome sequence strategies

Several individuals and/or samples can be sequenced together as a group, either with or without barcode labelling to facilitate multiplexing.

Representative genome assembly approaches

When multiple individual genomes are sequenced from a species, the best is selected as a reference genome for that organism.

N50 contig

A statistic used to illustrate genome assembly quality. Assemblies are constructed of multiple contigs (segments of the genome assembly that contain no gaps), each with a different length. The contig N50 is the length of the shortest contig in the smallest set of longest contigs that together contain at least half of the assembled sequence.
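The N50 statistic can be computed directly from a list of contig lengths: sort the contigs from longest to shortest, add up their lengths, and report the length of the contig at which the running total first reaches half of the assembly. A minimal sketch follows; the function name `n50` and the example lengths are illustrative assumptions.

```python
def n50(contig_lengths):
    """Return the contig N50 for an iterable of contig lengths (in bp)."""
    lengths = sorted(contig_lengths, reverse=True)   # longest contig first
    if not lengths:
        raise ValueError("need at least one contig")
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:                    # half of the assembly is covered
            return length


# Hypothetical assembly of six contigs totalling 290 bp.
print(n50([80, 70, 50, 40, 30, 20]))
```

In this example the two longest contigs (80 + 70 = 150 bp) already cover more than half of the 290 bp total, so the N50 is 70 bp. The same calculation applied to supercontig (scaffold) lengths gives the scaffold N50.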

Long-range contiguity

The linking of large, megabase-sized genomic regions in order to create large continuous lengths of sequence data.

Conserved synteny

The similarity of gene order in large regions of related (and distant) species.

Microchromosomes

Typical in some birds and lizards, these chromosomes are less than 20 Mb in size.

Selective sweep

A region of the genome where there is little to no population-level variation, as one haplotype with favourable alleles has become more common than other variants.
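One simple way to look for such regions is to scan the genome in windows and flag those with unusually low heterozygosity. The sketch below assumes that alternative-allele frequencies have already been estimated (for example, from pooled sequence data) and treats sites without a listed variant as invariant; the function name, the input layout and the window size are illustrative assumptions, not the method used in any particular study.

```python
def window_heterozygosity(variants, window_size, region_length):
    """Per-base expected heterozygosity in non-overlapping windows.

    `variants` is a list of (position, alt_allele_frequency) pairs with
    0-based positions inside a region of `region_length` bp. Sites that
    are not listed are assumed to be invariant (an assumption of this sketch).
    """
    n_windows = -(-region_length // window_size)     # ceiling division
    totals = [0.0] * n_windows
    for pos, p in variants:
        if not 0 <= pos < region_length:
            raise ValueError("variant position outside region")
        totals[pos // window_size] += 2 * p * (1 - p)  # expected heterozygosity 2p(1-p)
    result = []
    for i, total in enumerate(totals):
        # The last window may be shorter than window_size.
        width = min(window_size, region_length - i * window_size)
        result.append(total / width)
    return result


# Hypothetical 300 bp region with three variants, scanned in 100 bp windows.
het = window_heterozygosity([(10, 0.5), (20, 0.5), (150, 0.5)], 100, 300)
lowest = min(range(len(het)), key=het.__getitem__)
print(het, lowest)
```

Windows at the low end of the genome-wide distribution are candidate sweep regions. Low diversity can also reflect low mutation rate, low coverage or demography, so such windows are a starting point for follow-up rather than proof of selection.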

Hybridization

The mating of two different species or populations, resulting in offspring that carry equal proportions of genetic material from both parents.

Introgression

Gene flow from one species into the gene pool of another by the repeated backcrossing of a hybrid with one of its parental species.

Integrated haplotype homozygosity score

(iHS). A method to calculate the amount of genetic similarity across regions in a species or population. High homozygosity suggests selection to be active on that region.

SNP genotyping arrays

A method to genotype predefined single-nucleotide variants distributed across the genome of the species under study. For humans, single-nucleotide polymorphism (SNP) arrays typically have millions of variants, whereas in dogs, hundreds of thousands of SNPs are used for genome-wide association mapping.

Topologically associating domains

(TADs). Regions of the genome packaged together in 3D space, most often containing one or a few genes and their regulatory signals. Genomic interactions within TADs are more frequent than those across TAD borders.

Microevolution

Changes in allele frequencies that happen within a population in a fairly short time span. This can be due to positive selection or drift.

Macroevolution

Changes in allele frequencies that happen between species over a longer time period. This can be due to positive selection or drift.

Silent changes

Changes in DNA sequence without biological consequence.

Neutral sites

Positions in DNA that are not functional and hence are free to mutate randomly.

Conserved non-coding elements

(CNEs). Regions of the genome that are not coding for proteins, but are similar in many species, suggesting a role for these elements in genome regulation.

Transposable elements

Mobile DNA sequences, similar to viruses, that can 'jump' around in the genome and integrate in new locations. These sequences can affect gene expression or give rise to novel regulatory elements.

Positive selection

The force that makes certain genetic positions change in a certain favourable direction.

Accelerated regions

Regions of the genome that are typically conserved across species, but where novel changes have happened in one or more related species. This suggests that the region is under positive selection for the novel variant (or variants).

About this article

Cite this article

Meadows, J., Lindblad-Toh, K. Dissecting evolution and disease using comparative vertebrate genomics. Nat Rev Genet 18, 624–636 (2017). https://doi.org/10.1038/nrg.2017.51

  • DOI: https://doi.org/10.1038/nrg.2017.51
