Article

BreakDancer: an algorithm for high-resolution mapping of genomic structural variation

Nature Methods volume 6, pages 677–681 (2009)

Abstract

Detection and characterization of genomic structural variation are important for understanding the landscape of genetic variation in human populations and in complex diseases such as cancer. Recent studies demonstrate the feasibility of detecting structural variation using next-generation, short-insert, paired-end sequencing reads. However, the utility of these reads is not entirely clear, nor are the analysis methods with which accurate detection can be achieved. The algorithm BreakDancer predicts a wide variety of structural variants including insertion-deletions (indels), inversions and translocations. We examined BreakDancer's performance in simulation, in comparison with other methods and in analyses of a sample from an individual with acute myeloid leukemia and of samples from the 1,000 Genomes trio individuals. BreakDancer sensitively and accurately detected indels ranging from 10 base pairs to 1 megabase pair that are difficult to detect via a single conventional approach.

Acknowledgements

We thank the Genomics of AML Program Project Grant team at Washington University Medical School (US National Cancer Institute PO1 CA101937; principal investigator, T.J.L.) and the 1,000 Genomes Consortium members for providing the data. We thank members of the 1,000 Genomes structural variation group and H. Li for methodology discussions; D. Bentley and M. Ross (Illumina), C. Alkan and J. Kidd (University of Washington), and Y. Li and H. Zheng (Beijing Genome Institute) for providing validation data; and A. Chinwalla, D. Dooling, S. Smith, J. Eldred, C. Harris, L. Cook, V. Magrini, Y. Tang, H. Schmidt, C. Haipek, G. Elliott and R. Abbott for assistance. This work was supported by the National Human Genome Research Institute (HG003079; principal investigator, R.K.W.).

Author information

Affiliations

  1. The Genome Center, Washington University School of Medicine, St. Louis, Missouri, USA.

    • Ken Chen
    • John W Wallis
    • Michael D McLellan
    • David E Larson
    • Joelle M Kalicki
    • Craig S Pohl
    • Sean D McGrath
    • Michael C Wendl
    • Devin P Locke
    • Xiaoqi Shi
    • Robert S Fulton
    • Timothy J Ley
    • Richard K Wilson
    • Li Ding
    • Elaine R Mardis
  2. Division of Statistical Genomics, Washington University School of Medicine, St. Louis, Missouri, USA.

    • Qunyuan Zhang

Contributions

E.R.M., R.K.W., L.D. and T.J.L.: project conception and oversight. K.C.: algorithm design and implementation. J.W.W.: variant assembly. J.M.K., M.D.M. and R.S.F.: experimental validation. C.S.P. and L.D.: primer design. S.D.M. and D.P.L.: Illumina library preparation. Q.Z. and M.C.W.: statistical insight. J.W.W., D.E.L., X.S., and D.P.L.: variant characterization and visualization. K.C., E.R.M., M.C.W., L.D. and J.W.W.: manuscript preparation.

Corresponding author

Correspondence to Ken Chen.

Supplementary information

PDF files

  1.

    Supplementary Text and Figures

    Supplementary Figures 1–12, Supplementary Tables 2, 4–8 and Supplementary Note

Excel files

  1.

    Supplementary Table 1

    List of structural variants detected in simulation.

  2.

    Supplementary Table 3

    A list of AML2 structural variants detected by BreakDancer, refined by local assembly and validated via PCR resequencing.

Zip files

  1.

    Supplementary Software

    The BreakDancer software package encompasses two algorithms: BreakDancerMax detects large structural variants (deletions, insertions, inversions, and intra- and interchromosomal translocations), and BreakDancerMini detects small (10–100 bp) insertions and deletions.
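The variant classes listed above correspond to well-known paired-end mapping signatures: a pair that maps to two chromosomes, maps with unexpected orientation, or spans much more or much less than the library's insert size. The following Python sketch illustrates that general idea only. It is not the BreakDancer implementation, and the `ReadPair` type, the `classify_pair` function and the three-standard-deviation cutoff are all hypothetical names and values chosen for illustration; the paper's actual statistical model is not described on this page.

```python
from dataclasses import dataclass


@dataclass
class ReadPair:
    """Mapped positions of the two ends of one paired-end read (hypothetical type)."""
    chrom1: str
    pos1: int
    strand1: str  # "+" or "-"
    chrom2: str
    pos2: int
    strand2: str  # "+" or "-"


def classify_pair(pair, mean_insert, sd_insert, n_sd=3.0):
    """Label a read pair by a simplified paired-end signature.

    Assumes a library whose normal pairs map to opposite strands of the
    same chromosome, with a span close to mean_insert. The n_sd cutoff
    is an illustrative assumption, not a value taken from the paper.
    """
    # Ends on different chromosomes suggest an interchromosomal event.
    if pair.chrom1 != pair.chrom2:
        return "interchromosomal"
    # Both ends on the same strand suggest an inversion.
    if pair.strand1 == pair.strand2:
        return "inversion"
    span = abs(pair.pos2 - pair.pos1)
    # A span much larger than expected suggests a deletion in the sample.
    if span > mean_insert + n_sd * sd_insert:
        return "deletion"
    # A span much smaller than expected suggests an insertion in the sample.
    if span < mean_insert - n_sd * sd_insert:
        return "insertion"
    return "normal"


if __name__ == "__main__":
    pair = ReadPair("chr1", 1000, "+", "chr1", 5000, "-")
    print(classify_pair(pair, mean_insert=200, sd_insert=20))
```

A per-pair cutoff like this cannot resolve the small (10–100 bp) indels mentioned above, because the shift in span is within the normal spread of a single pair; detecting them requires pooling evidence across many pairs covering the same region.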

About this article

DOI

https://doi.org/10.1038/nmeth.1363
