This protocol describes a method for converting short single-stranded and double-stranded DNA into libraries compatible with high-throughput sequencing using Illumina technology. The method was developed primarily to improve sequence retrieval from ancient DNA, but it is also applicable to the sequencing of short or degraded DNA from other sources, as well as to the sequencing of oligonucleotides. Single-stranded library preparation is performed by ligating a biotinylated adapter oligonucleotide to the 3′ ends of heat-denatured DNA. The resulting strands are then immobilized on streptavidin-coated beads and copied with a polymerase. A second adapter is attached by blunt-end ligation, and library preparation is completed by PCR amplification. We estimate that intact DNA strands are recovered in the library with ∼50% efficiency. Libraries can be generated from up to 12 DNA or oligonucleotide samples in parallel within 2 d.
References
- Sequencing the nuclear genome of the extinct woolly mammoth. Nature 456, 387–390 (2008).
- Ancient human genome sequence of an extinct Palaeo-Eskimo. Nature 463, 757–762 (2010).
- A draft sequence of the Neandertal genome. Science 328, 710–722 (2010).
- Genetic history of an archaic hominin group from Denisova Cave in Siberia. Nature 468, 1053–1060 (2010).
- Genome sequencing in microfabricated high-density picolitre reactors. Nature 437, 376–380 (2005).
- Accurate whole human genome sequencing using reversible terminator chemistry. Nature 456, 53–59 (2008).
- Sequence and structural variation in a human genome uncovered by short-read, massively parallel ligation sequencing using two-base encoding. Genome Res. 19, 1527–1541 (2009).
- An integrated semiconductor device enabling non-optical genome sequencing. Nature 475, 348–352 (2011).
- Targeted retrieval and analysis of five Neandertal mtDNA genomes. Science 325, 318–321 (2009).
- Targeted investigation of the Neandertal genome by array-based sequence capture. Science 328, 723–725 (2010).
- Multiplexed DNA sequence capture of mitochondrial genomes using PCR products. PLoS ONE 5, e14004 (2010).
- Application and comparison of large-scale solution-based DNA capture-enrichment methods on ancient DNA. Sci. Rep. 1, 74 (2011).
- Solid-phase reversible immobilization for the isolation of PCR products. Nucleic Acids Res. 23, 4742–4743 (1995).
- A high-coverage genome sequence from an archaic Denisovan individual. Science 338, 222–226 (2012).
- Length and GC-biases during sequencing library amplification: a comparison of various polymerase-buffer systems with ancient and modern DNA sequencing libraries. Biotechniques 52, 87–94 (2012).
- The isolation of nucleic acids from fixed, paraffin-embedded tissues—which methods are useful when? PLoS ONE 2, e537 (2007).
- DNA extraction from archival formalin-fixed, paraffin-embedded tissue sections based on the antigen retrieval principle: heating under the influence of pH. J. Histochem. Cytochem. 50, 1005–1011 (2002).
- Single-cell exome sequencing and monoclonal evolution of a JAK2-negative myeloproliferative neoplasm. Cell 148, 873–885 (2012).
- Rapid, low-input, low-bias construction of shotgun fragment libraries by high-density in vitro transposition. Genome Biol. 11, R119 (2010).
- Structure-independent and quantitative ligation of single-stranded DNA. Anal. Biochem. 349, 242–246 (2006).
- Illumina sequencing library preparation for highly multiplexed target capture and sequencing. Cold Spring Harb. Protoc. 2010, pdb.prot5448 (2010).
- Double indexing overcomes inaccuracies in multiplex sequencing on the Illumina platform. Nucleic Acids Res. 40, e3 (2012).
- Novel high-resolution characterization of ancient DNA reveals C > U-type base modification events as the sole cause of postmortem miscoding lesions. Nucleic Acids Res. 35, 5717–5728 (2007).
- Patterns of damage in genomic DNA sequences from a Neandertal. Proc. Natl. Acad. Sci. USA 104, 14616–14621 (2007).
- Removal of deaminated cytosines and detection of in vivo methylation in ancient DNA. Nucleic Acids Res. 38, e87 (2010).
- A complete mtDNA genome of an early modern human from Kostenki, Russia. Curr. Biol. 20, 231–236 (2010).
- Temporal patterns of nucleotide misincorporations and DNA fragmentation in ancient DNA. PLoS ONE 7, e34131 (2012).
- Comparison and optimization of ancient DNA extraction. Biotechniques 42, 343–352 (2007).
- Ancient DNA. Proc. Biol. Sci. 272, 3–16 (2005).
- An efficient multistrategy DNA decontamination procedure of PCR reagents for hypersensitive PCR applications. PLoS ONE 5, e13042 (2010).
- Addressing challenges in the production and analysis of Illumina sequencing data. BMC Genom. 12, 382 (2011).
- Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics 25, 1754–1760 (2009).
- Supplementary Figure 1
Sequences of adapters, amplification and sequencing primers. Adapter sequences of a regular single-indexed Illumina multiplex library (as obtained using Illumina's TruSeq DNA sample preparation kit, cat. no. FC-121-2001/2002, or following the protocol provided in ref. 21 of the main text) are shown on top. Libraries prepared from single-stranded DNA differ by a deletion of 5 bp in the P5 adapter. Both library types are compatible with double-indexed sequencing (see ref. 22 of the main text for details). Amplification and sequencing primers are indicated by arrows. Primers with prefix 'IS' are further described in ref. 21 of the main text.
- Supplementary Figure 2
Characterization of two DNA libraries prepared with and without uracil removal from a Neanderthal DNA extract (panels on the right and left, respectively). A) Fragment length distributions of the amplified libraries obtained from chip electrophoresis using the Bioanalyzer 2100. B) Fragment size distributions obtained from sequencing (the fraction of mapped sequences is indicated by a dotted line). C) Frequency of C to T substitutions around the 5′ and 3′ ends of Neanderthal sequences. D) Average GC content of Neanderthal sequences as a function of fragment size.
- Supplementary Figure 3
Quality control of single-stranded adapter oligonucleotide CL78. From two independently synthesized batches of the oligonucleotide, 10 pmol were loaded onto a 10% denaturing polyacrylamide gel, next to a size marker (lane 1; 20/100 ladder). The gel was run for 35 min at 200 V and stained with SYBR Gold dye. Whereas no impurities were detected in the first batch (lane 2), the second batch (lane 3) is dominated by a double-length artifact, representing an extreme example of poor oligonucleotide synthesis quality.
- Supplementary Figure 4
Determining the optimal cycle number for indexing PCR using the qPCR amplification plots. Shown are the amplification plots obtained from quantifying the libraries prepared in the experiment described in 'Anticipated results'. The saturation phase of PCR starts after cycle 18 (sample libraries, red and blue) and cycle 23 (blank library, green), respectively. Assuming full amplification efficiency (i.e., a doubling of library molecules in each cycle), the optimal cycle number for indexing PCR can be determined by correcting for differences in reaction volumes and the amount of template DNA: (i) qPCR was performed in 25 μl reactions, whereas indexing PCR is performed in a 100 μl volume. Thus, 2 cycles should be added to allow for four times as much end product. (ii) One microliter of a 1:20 library dilution was used for measurement, whereas 24 μl of the undiluted library are used for indexing PCR (480 times as much). This corresponds to log2(480) ≈ 8.9 (rounded to 9) cycles that should be deducted. Thus, 11 (18 + 2 − 9) and 16 (23 + 2 − 9) were estimated to be the optimal cycle numbers for indexing PCR.
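The cycle-number correction described in this caption can be sketched in a few lines of Python. This is an illustrative calculation only, not part of the protocol: the function name `indexing_pcr_cycles` and its parameter names are hypothetical, the default values are the volumes and dilution quoted in the caption, and full amplification efficiency (a doubling per cycle) is assumed throughout. The sketch rounds the final result, whereas the caption rounds the template correction to 9 before subtracting; both give the same answer for the two examples.

```python
import math


def indexing_pcr_cycles(saturation_cycle,
                        qpcr_volume_ul=25.0,
                        pcr_volume_ul=100.0,
                        qpcr_template_ul=1.0,
                        qpcr_dilution=20.0,
                        pcr_template_ul=24.0):
    """Estimate the indexing PCR cycle number from a qPCR saturation cycle.

    Hypothetical helper: assumes a perfect doubling of library molecules
    in every cycle, as stated in the caption.
    """
    # (i) A larger reaction volume needs more end product:
    # 100 ul / 25 ul = 4-fold, i.e. log2(4) = 2 extra cycles.
    volume_cycles = math.log2(pcr_volume_ul / qpcr_volume_ul)

    # (ii) More template needs fewer cycles: 1 ul of a 1:20 dilution
    # equals 0.05 ul of undiluted library, so 24 ul is 480-fold more,
    # i.e. log2(480) ~ 8.9 cycles to deduct.
    undiluted_equivalent_ul = qpcr_template_ul / qpcr_dilution
    template_cycles = math.log2(pcr_template_ul / undiluted_equivalent_ul)

    return round(saturation_cycle + volume_cycles - template_cycles)


if __name__ == "__main__":
    # Saturation cycles quoted in the caption: 18 (samples), 23 (blank).
    for label, cycle in [("sample libraries", 18), ("blank library", 23)]:
        print(label, indexing_pcr_cycles(cycle))
```

With the saturation cycles given in the caption, the sketch reproduces the estimates of 11 cycles for the sample libraries and 16 cycles for the blank library.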
- Supplementary Table 1
Program settings of Cooling-ThermoMixer MKR13 recommended for single-stranded library preparation. The device may also be used to replace the vortexer in bead resuspension steps (use 'short mix' button).