High-throughput techniques based on restriction site-associated DNA sequencing (RADseq) enable the low-cost discovery and genotyping of thousands of genetic markers in any species, including non-model organisms, and are revolutionizing ecological, evolutionary and conservation genetics. Technical differences among these methods lead to important considerations for every step of a genomics study, from the scientific questions that can be addressed and the costs of library preparation and sequencing to the types of bias and error inherent in the resulting data. In this Review, we provide a comprehensive discussion of RADseq methods to help researchers choose among the many approaches and avoid drawing erroneous scientific conclusions from RADseq data, a problem that has plagued other genetic marker types in the past.
- Supplementary information S1 (figure): Numbers of articles citing the original papers describing each type of RADseq protocol over time.