Whole-genome sequencing and variant discovery in C. elegans

Abstract

Massively parallel sequencing instruments enable rapid and inexpensive DNA sequence data production. Because these instruments are new, their data require characterization with respect to accuracy and utility. To address this, we sequenced a Caenorhabditis elegans N2 Bristol strain isolate using the Solexa Sequence Analyzer, and compared the reads to the reference genome to characterize the data and to evaluate coverage and representation. Massively parallel sequencing facilitates strain-to-reference comparison for genome-wide sequence variant discovery. Owing to the short read lengths produced, we developed a revised approach to determine the regions of the genome to which short reads could be uniquely mapped. We then aligned Solexa reads from C. elegans strain CB4858 to the reference, and screened for single-nucleotide polymorphisms (SNPs) and small indels. This study demonstrates the utility of massively parallel short-read sequencing for whole-genome resequencing and for accurate discovery of genome-wide polymorphisms.
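The idea of a uniquely mappable region can be illustrated with a minimal sketch. This is not the approach developed in the paper, whose details are not given in the abstract; it is a simple exact-match stand-in that assumes an uppercase A/C/G/T reference string and a fixed read length `k`. The function names (`reverse_complement`, `canonical`, `unique_start_positions`) are hypothetical and chosen only for illustration. A start position is flagged as unique when the k-mer beginning there, considered together with its reverse complement, occurs exactly once in the reference.

```python
from collections import Counter

# Translation table for complementing bases (assumes uppercase A/C/G/T input).
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def canonical(kmer: str) -> str:
    """Pick one representative for a k-mer and its reverse complement,
    so that a read from either strand is counted under the same key."""
    return min(kmer, reverse_complement(kmer))


def unique_start_positions(reference: str, k: int) -> list[bool]:
    """For each start position, report whether the k-mer beginning there
    occurs exactly once in the reference on either strand (exact matches only)."""
    n = len(reference) - k + 1
    if n <= 0:
        return []
    keys = [canonical(reference[i:i + k]) for i in range(n)]
    counts = Counter(keys)
    return [counts[key] == 1 for key in keys]


if __name__ == "__main__":
    flags = unique_start_positions("AAAAC", 3)
    print(flags)                     # per-position uniqueness flags
    print(sum(flags) / len(flags))   # fraction of uniquely mappable start positions
```

A real analysis would also have to tolerate mismatches and handle ambiguous bases, which this exact-match sketch deliberately leaves out.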

Figure 1: N2 Bristol Solexa read analysis.
Figure 2: Accuracy distribution of N2 Bristol Solexa single-end reads.
Figure 3: Position dependency of base calling accuracy for N2 Bristol Solexa single-end reads.
Figure 4: Repetitive content in C. elegans.

Acknowledgements

We acknowledge National Human Genome Research Institute funding (HG003079-04 to R.K.W. and HG003698 to G.T.M.). We thank K. Hall and D. Bentley of Illumina, Inc. for generously producing the paired-end read data described in the manuscript, M. Wendl for careful reading of the manuscript, and T. Bieri for submitting the CB4858 variants to WormBase.

Author information

Contributions

L.W.H., N2 Bristol read, coverage, variant and gap analyses; G.T.M., CB4858 SNP discovery and N2 Bristol error profile analysis; A.R.Q., CB4858 SNP discovery and validation analysis; D.D., Solexa analysis pipeline; G.F., validation assay design and analysis; D.B., Solexa base quality value analysis; P.F., preparation of N2 Bristol and CB4858 DNA; J.I.G., N2 Bristol read analysis; M.H., Solexa libraries and sequencing; W.H., microrepeat analysis; V.J.M., Solexa libraries and sequencing; R.J.R., N2 Bristol analysis; S.N.S., validation assays; D.A.S., microrepeat masking of C. elegans; M.S., Mosaik adaptation; E.F.T., microrepeat finding; T.W., N2 Bristol analysis; T.S., C. elegans strain selection; R.K.W., project origination; E.R.M., project coordination and manuscript preparation.

Corresponding author

Correspondence to Elaine R Mardis.

Supplementary information

Supplementary Text and Figures

Supplementary Figures 1–4, Supplementary Data, Supplementary Methods, Supplementary Table 1 (PDF 1178 kb)

About this article

Cite this article

Hillier, L., Marth, G., Quinlan, A. et al. Whole-genome sequencing and variant discovery in C. elegans. Nat Methods 5, 183–188 (2008). https://doi.org/10.1038/nmeth.1179

  • DOI: https://doi.org/10.1038/nmeth.1179
