  • Analysis

Comprehensive assessment of array-based platforms and calling algorithms for detection of copy number variants

This article has been updated

Abstract

We have systematically compared copy number variant (CNV) detection on eleven microarrays to evaluate data quality and CNV calling, reproducibility, concordance across array platforms and laboratory sites, breakpoint accuracy and analysis tool variability. Different analytic tools applied to the same raw data typically yield CNV calls with <50% concordance. Moreover, reproducibility in replicate experiments is <70% for most platforms. Nevertheless, these findings should not preclude detection of large CNVs for clinical diagnostic purposes because large CNVs with poor reproducibility are found primarily in complex genomic regions and would typically be removed by standard clinical data curation. The striking differences between CNV calls from different platforms and analytic tools highlight the importance of careful assessment of experimental design in discovery and association studies and of strict data curation and filtering in diagnostics. The CNV resource presented here allows independent data evaluation and provides a means to benchmark new algorithms.
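
To make the concordance figures above concrete, here is a minimal sketch of how agreement between two CNV call sets could be scored. The matching criteria used in the study are not stated in this abstract. The 50% reciprocal-overlap threshold, the `CNV` record, and the function names below are illustrative assumptions, not the authors' method.

```python
from typing import NamedTuple, Sequence


class CNV(NamedTuple):
    """Hypothetical CNV call record (half-open coordinates, start < end)."""
    chrom: str
    start: int
    end: int
    kind: str  # "gain" or "loss"


def reciprocal_overlap(a: CNV, b: CNV) -> float:
    """Return the smaller of the two overlap fractions, or 0.0 if the calls
    are on different chromosomes, differ in type, or do not overlap."""
    if a.chrom != b.chrom or a.kind != b.kind:
        return 0.0
    shared = min(a.end, b.end) - max(a.start, b.start)
    if shared <= 0:
        return 0.0
    return min(shared / (a.end - a.start), shared / (b.end - b.start))


def concordance(calls_a: Sequence[CNV], calls_b: Sequence[CNV],
                min_overlap: float = 0.5) -> float:
    """Fraction of all calls (from both sets) that have at least one match
    in the other set at the assumed reciprocal-overlap threshold."""
    total = len(calls_a) + len(calls_b)
    if total == 0:
        return 0.0  # nothing to compare
    matched_a = sum(
        any(reciprocal_overlap(a, b) >= min_overlap for b in calls_b)
        for a in calls_a
    )
    matched_b = sum(
        any(reciprocal_overlap(a, b) >= min_overlap for a in calls_a)
        for b in calls_b
    )
    return (matched_a + matched_b) / total


# Toy example: two tools run on the same sample (made-up coordinates).
tool_1 = [CNV("chr1", 100, 200, "loss"), CNV("chr2", 500, 900, "gain")]
tool_2 = [CNV("chr1", 120, 210, "loss"), CNV("chr3", 0, 50, "loss")]
score = concordance(tool_1, tool_2)
```

In this toy case only the chr1 deletion is shared, so half of the four calls are matched. The same function applied to replicate experiments on one platform would give a reproducibility-style measure under the same assumed threshold.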

Figure 1: Size distribution of CNV calls.
Figure 2: CNV calling reproducibility.
Figure 3: Reproducibility of CNV breakpoint assignments.
Figure 4: CNV breakpoint accuracy.

Accession codes

Gene Expression Omnibus

Change history

  • 29 May 2011

    In the version of this article initially published online, Bhooma Thiruvahindrapuram’s name was misspelled. The error has been corrected for the print, PDF and HTML versions of this article.

Acknowledgements

We thank J. Rickaby and M. Lee for excellent technical assistance. We thank colleagues at Affymetrix, Agilent, Illumina, NimbleGen and Biodiscovery for sharing data and software and for technical assistance. The Toronto Centre for Applied Genomics at the Hospital for Sick Children is acknowledged for database, technical and bioinformatics support. This work was supported by funding from the Genome Canada/Ontario Genomics Institute, the Canadian Institutes of Health Research (CIHR), the McLaughlin Centre, the Canadian Institute of Advanced Research, the Hospital for Sick Children (SickKids) Foundation, a Broad SPARC Project award to P.K.D. and C.L., US National Institutes of Health (NIH) grant HD055150 to P.K.D., and the Department of Pathology at Brigham and Women's Hospital in Boston and NIH grants HG005209, HG004221 and CA111560 to C.L. N.P.C., D. Rajan, D. Rigler, T.F., S.G. and E.P. are supported by the Wellcome Trust (grant no. WT077008). D.P. is supported by fellowships from the Canadian Institutes of Health Research (no. 213997) and the Netherlands Organization for Scientific Research (Rubicon 825.06.031). X.S. is supported by a T32 Harvard Medical School training grant, and K.N. is supported by a T32 institutional training grant (HD007396). S.W.S. holds the GlaxoSmithKline-CIHR Pathfinder Chair in Genetics and Genomics at the University of Toronto and the Hospital for Sick Children (Canada). L.F. is supported by the Göran Gustafsson Foundation and the Swedish Foundation for Strategic Research.

Author information

Contributions

D.P., C.L., N.P.C., M.E.H., S.W.S. and L.F. conceived and designed the study. D.P. and L.F. coordinated sample distribution, experiments and analysis. K.D. managed the experiments conceived at the Harvard Medical School and performed the Nexus analysis. R.S.S., D. Rajan, D. Rigler, T.F., J.H.P., K.N., S.G. and E.P. performed the experiments. Data analyses were performed by D.P., K.D., R.S.S., D. Rajan, T.F., A.C.L., B.T., J.R.M., R.M., A.P., K.N., X.S., P.K.D. and L.F. All authors participated in discussions of different parts of the study. D.P., C.L., S.W.S. and L.F. wrote the manuscript. All authors read and approved the manuscript.

Corresponding author

Correspondence to Lars Feuk.

Ethics declarations

Competing interests

The authors declare competing interests. Affymetrix, Agilent, Illumina and NimbleGen provided arrays or reagents for use in this study at a substantial discount. The Centre for Applied Genomics (TCAG) routinely provides fee-for-service experimentation using products from Affymetrix, Agilent and Illumina, and is a Core Lab for Affymetrix and Illumina. S.W.S. belongs to the Scientific Advisory Board of Combimatrix Diagnostics.

Supplementary information

Supplementary Text and Figures

Supplementary Methods, Supplementary Tables 1, 2, 4–6, and Supplementary Figs. 1–15 (PDF 2872 kb)

Supplementary Table 3

List of all CNVs that passed QC. (XLS 15495 kb)

About this article

Cite this article

Pinto, D., Darvishi, K., Shi, X. et al. Comprehensive assessment of array-based platforms and calling algorithms for detection of copy number variants. Nat Biotechnol 29, 512–520 (2011). https://doi.org/10.1038/nbt.1852

  • DOI: https://doi.org/10.1038/nbt.1852
