Alta-Cyclic: a self-optimizing base caller for next-generation sequencing

Abstract

Next-generation sequencing is limited to short read lengths and by high error rates. We systematically analyzed sources of noise in the Illumina Genome Analyzer that contribute to these high error rates and developed a base caller, Alta-Cyclic, that uses machine learning to compensate for noise factors. Alta-Cyclic substantially improved the number of accurate reads for sequencing runs up to 78 bases and reduced systematic biases, facilitating confident identification of sequence variants.

Figure 1: Schematic representation of main Illumina noise factors.
Figure 2: Alta-Cyclic base caller data flow.
Figure 3: Comparison between Alta-Cyclic and Illumina base caller on the GAII platform.

Acknowledgements

We thank M. Rooks, E. Hodges, K. Fejes-Toth and C. Malone for help in preparing libraries. We thank M. Regulski, D. Rebolini and L. Cardone for Illumina sequencing, and T. Heywood for assistance with cluster computing. F. Chen, D. Hillman and J. Eisen (Lawrence Berkeley National Lab) provided the Tetrahymena micronuclear library. Y.E. is a Goldberg-Lindsay Fellow of the Watson School of Biological Sciences. P.P.M. is a Crick-Clay Professor. G.J.H. is an investigator of the Howard Hughes Medical Institute. This work was supported by grants from the US National Institutes of Health, the National Science Foundation and the Stanley Foundation.

Author information

Correspondence to Gregory J Hannon.

Supplementary information

Supplementary Text and Figures

Supplementary Figures 1–7, Supplementary Table 1, Supplementary Data, Supplementary Methods (PDF 941 kb)
