Abstract
Detection and characterization of genomic structural variation are important for understanding the landscape of genetic variation in human populations and in complex diseases such as cancer. Recent studies demonstrate the feasibility of detecting structural variation using next-generation, short-insert, paired-end sequencing reads. However, the utility of these reads is not entirely clear, nor are the analysis methods with which accurate detection can be achieved. The algorithm BreakDancer predicts a wide variety of structural variants including insertion-deletions (indels), inversions and translocations. We examined BreakDancer's performance in simulation, in comparison with other methods and in analyses of a sample from an individual with acute myeloid leukemia and of samples from the 1,000 Genomes trio individuals. BreakDancer sensitively and accurately detected indels ranging from 10 base pairs to 1 megabase pair that are difficult to detect via a single conventional approach.
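As a rough illustration of how paired-end reads can signal the variant classes named above, the sketch below labels a single mapped read pair by its chromosome assignment, strand orientation and mapped span. It is not BreakDancer's implementation: the `ReadPair` type, the `classify_pair` function, the label names and the three-standard-deviation cutoff are hypothetical choices made for this example, and the span between leftmost mapped positions is used as a crude stand-in for insert size.

```python
from dataclasses import dataclass


@dataclass
class ReadPair:
    """Hypothetical minimal record of one mapped paired-end read."""
    chrom1: str
    pos1: int      # leftmost mapped position of read 1
    strand1: str   # '+' or '-'
    chrom2: str
    pos2: int      # leftmost mapped position of read 2
    strand2: str   # '+' or '-'


def classify_pair(pair, mean_insert, sd_insert, n_sd=3.0):
    """Label one read pair by the kind of anomaly it suggests.

    Assumes a library in which a normal pair maps as '+' then '-' on the
    same chromosome, with a span close to mean_insert. The n_sd cutoff is
    an illustrative assumption, not a published parameter.
    """
    if pair.chrom1 != pair.chrom2:
        return "interchromosomal"

    # Order the two reads by mapped position.
    (left_pos, left_strand), (right_pos, right_strand) = sorted(
        [(pair.pos1, pair.strand1), (pair.pos2, pair.strand2)]
    )

    if left_strand == right_strand:
        return "inversion"
    if left_strand == "-" and right_strand == "+":
        return "everted"  # unexpected outward-facing orientation

    # Approximate insert size as the distance between leftmost positions.
    span = right_pos - left_pos
    if span > mean_insert + n_sd * sd_insert:
        return "deletion"   # reads map farther apart than expected
    if span < mean_insert - n_sd * sd_insert:
        return "insertion"  # reads map closer together than expected
    return "normal"


if __name__ == "__main__":
    pair = ReadPair("chr1", 1000, "+", "chr1", 2000, "-")
    print(classify_pair(pair, mean_insert=200, sd_insert=20))
```

A single anomalous pair is weak evidence on its own; a practical caller would require several concordant pairs supporting the same event before reporting it.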
Acknowledgements
We thank the Genomics of AML Program Project Grant team at Washington University Medical School (US National Cancer Institute PO1 CA101937; principal investigator, T.J.L.) and the 1,000 Genomes Consortium members for providing the data. We thank members of the 1,000 Genomes structural variation group and H. Li for methodology discussions; D. Bentley and M. Ross (Illumina), C. Alkan and J. Kidd (University of Washington), and Y. Li and H. Zheng (Beijing Genome Institute) for providing validation data; and A. Chinwalla, D. Dooling, S. Smith, J. Eldred, C. Harris, L. Cook, V. Magrini, Y. Tang, H. Schmidt, C. Haipek, G. Elliott and R. Abbott for assistance. This work was supported by the National Human Genome Research Institute (HG003079; principal investigator, R.K.W.).
Author information
Contributions
E.R.M., R.K.W., L.D. and T.J.L.: project conception and oversight. K.C.: algorithm design and implementation. J.W.W.: variant assembly. J.M.K., M.D.M. and R.S.F.: experimental validation. C.S.P. and L.D.: primer design. S.D.M. and D.P.L.: Illumina library preparation. Q.Z. and M.C.W.: statistical insight. J.W.W., D.E.L., X.S., and D.P.L.: variant characterization and visualization. K.C., E.R.M., M.C.W., L.D. and J.W.W.: manuscript preparation.
Supplementary information
Supplementary Text and Figures
Supplementary Figures 1–12, Supplementary Tables 2, 4–8 and Supplementary Note (PDF 1386 kb)
Supplementary Table 1
List of structural variants detected in simulation. (XLS 72 kb)
Supplementary Table 3
A list of AML2 structural variants detected by BreakDancer, refined by local assembly and validated via PCR resequencing. (XLS 99 kb)
Supplementary Software
The BreakDancer software package encompasses two algorithms: BreakDancerMax detects large structural variants (deletions, insertions, inversions, and intra- and interchromosomal translocations), and BreakDancerMini detects small (10–100 bp) insertions and deletions. (ZIP 16 kb)
About this article
Cite this article
Chen, K., Wallis, J., McLellan, M. et al. BreakDancer: an algorithm for high-resolution mapping of genomic structural variation. Nat Methods 6, 677–681 (2009). https://doi.org/10.1038/nmeth.1363
DOI: https://doi.org/10.1038/nmeth.1363