Article

Identification of genetic variants using bar-coded multiplexed sequencing

Nature Methods volume 5, pages 887–893 (2008)

Abstract

We developed a generalized framework for multiplexed resequencing of targeted human genome regions on the Illumina Genome Analyzer using degenerate indexed DNA bar codes ligated to fragmented DNA before sequencing. Using this method, we simultaneously sequenced the DNA of multiple HapMap individuals at several Encyclopedia of DNA Elements (ENCODE) regions. We then evaluated the use of Bayes factors for discovering and genotyping polymorphisms. For polymorphisms that were either previously identified within the Single Nucleotide Polymorphism database (dbSNP) or visually evident upon re-inspection of archived ENCODE traces, we observed a false positive rate of 11.3% using strict thresholds for predicting variants and 69.6% for lax thresholds. Conversely, false negative rates were 10.8–90.8%, with false negatives at stricter cut-offs occurring at lower coverage (<10 aligned reads). These results suggest that >90% of genetic variants are discoverable using multiplexed sequencing provided sufficient coverage at the polymorphic base.
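As an illustration of why coverage matters for Bayes-factor genotyping, the sketch below compares a heterozygous-variant hypothesis against a homozygous-reference hypothesis at a single site. It is a generic binomial model, not the model used in the paper, which the abstract does not specify. The function names, the 1% per-base error rate, and the log10 thresholds standing in for "strict" and "lax" cut-offs are all assumptions chosen for illustration.

```python
from math import log10


def log10_bayes_factor(alt_reads: int, depth: int, error_rate: float = 0.01) -> float:
    """log10 Bayes factor for 'heterozygous variant' vs 'homozygous reference'.

    Assumed model (hypothetical, not taken from the paper):
      - heterozygous site: each aligned read shows the alternate base with p = 0.5
      - homozygous reference: alternate bases appear only as errors, p = error_rate
    The binomial coefficient is the same under both hypotheses and cancels,
    so only the per-read likelihood ratios remain.
    """
    if not 0 <= alt_reads <= depth:
        raise ValueError("alt_reads must be between 0 and depth")
    if not 0.0 < error_rate < 1.0:
        raise ValueError("error_rate must be strictly between 0 and 1")
    ref_reads = depth - alt_reads
    return (alt_reads * log10(0.5 / error_rate)
            + ref_reads * log10(0.5 / (1.0 - error_rate)))


def call_variant(alt_reads: int, depth: int, threshold: float,
                 error_rate: float = 0.01) -> bool:
    """Call a variant when the log10 Bayes factor reaches the chosen threshold."""
    return log10_bayes_factor(alt_reads, depth, error_rate) >= threshold


if __name__ == "__main__":
    STRICT, LAX = 3.0, 1.0  # illustrative thresholds, not the paper's values
    # Same 50% alternate-allele fraction at increasing coverage.
    for alt, depth in [(1, 2), (2, 4), (5, 10), (10, 20)]:
        bf = log10_bayes_factor(alt, depth)
        print(f"depth={depth:2d} alt={alt:2d} log10BF={bf:6.2f} "
              f"strict={call_variant(alt, depth, STRICT)} "
              f"lax={call_variant(alt, depth, LAX)}")
```

Under these assumptions, a true heterozygous site covered by only a few reads yields a small Bayes factor and is missed at a strict threshold, while the same allele fraction at higher depth passes easily. This matches the abstract's observation that false negatives at stricter cut-offs occur at low coverage.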

Accessions

GenBank/EMBL/DDBJ

Acknowledgements

We acknowledge funding from the state of Arizona, the US National Heart, Lung, and Blood Institute (U01 HL086528), the Stardust Foundation, Science Foundation Arizona, and the US National Institute of Neurological Disorders and Stroke (R01 N5059873).

Author information

Author notes

    • David W Craig
    • John V Pearson
    • Szabolcs Szelinger

    These authors contributed equally to this work.

Affiliations

  1. The Translational Genomics Research Institute, 445 N. 5th St. 5th Floor, Phoenix, Arizona 85004, USA.

    • David W Craig
    • John V Pearson
    • Szabolcs Szelinger
    • Aswin Sekar
    • Margot Redman
    • Jason J Corneveaux
    • Traci L Pawlowski
    • Trisha Laub
    • Dietrich A Stephan
    • Nils Homer
    • Matthew J Huentelman
  2. Illumina, 9885 Town Centre Drive, San Diego, California 92121, USA.

    • Gary Nunn

Contributions

D.W.C., J.V.P., M.J.H., G.N. and D.A.S. contributed to initial experimental design. S.S., A.S., M.R., J.J.C., T.L. and T.L.P. contributed to development and execution of exact experimental protocols. J.V.P., D.W.C. and N.H. contributed to the development of bioinformatics and analysis pipelines.

Competing interests

G.N. is an employee of Illumina.

Corresponding author

Correspondence to David W Craig.

Supplementary information

PDF files

  1. Supplementary Text and Figures

    Supplementary Figures 1–2, Supplementary Tables 1–5, Supplementary Methods

About this article

DOI

https://doi.org/10.1038/nmeth.1251
