This article has been updated

Abstract

We have systematically compared copy number variant (CNV) detection on eleven microarrays to evaluate data quality and CNV calling, reproducibility, concordance across array platforms and laboratory sites, breakpoint accuracy and analysis tool variability. Different analytic tools applied to the same raw data typically yield CNV calls with <50% concordance. Moreover, reproducibility in replicate experiments is <70% for most platforms. Nevertheless, these findings should not preclude detection of large CNVs for clinical diagnostic purposes because large CNVs with poor reproducibility are found primarily in complex genomic regions and would typically be removed by standard clinical data curation. The striking differences between CNV calls from different platforms and analytic tools highlight the importance of careful assessment of experimental design in discovery and association studies and of strict data curation and filtering in diagnostics. The CNV resource presented here allows independent data evaluation and provides a means to benchmark new algorithms.
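The abstract reports concordance percentages without stating how two sets of CNV calls are matched. As a minimal sketch, the example below assumes one common convention: two calls match when they are on the same chromosome, have the same type, and overlap reciprocally by at least 50% of each call's length. The `CNV` record, the function names, the 50% threshold and the example coordinates are all illustrative assumptions, not the criteria or data used in the study.

```python
from typing import List, NamedTuple


class CNV(NamedTuple):
    """Hypothetical CNV call record (not the study's data format)."""
    chrom: str
    start: int  # start coordinate, inclusive
    end: int    # end coordinate, exclusive
    kind: str   # "gain" or "loss"


def reciprocal_overlap(a: CNV, b: CNV, min_frac: float = 0.5) -> bool:
    """True if a and b share at least min_frac of BOTH calls' lengths."""
    if a.chrom != b.chrom or a.kind != b.kind:
        return False
    shared = min(a.end, b.end) - max(a.start, b.start)
    if shared <= 0:
        return False
    return (shared >= min_frac * (a.end - a.start)
            and shared >= min_frac * (b.end - b.start))


def concordance(calls_a: List[CNV], calls_b: List[CNV],
                min_frac: float = 0.5) -> float:
    """Fraction of all calls (both sets) that have a match in the other set."""
    matched_a = sum(
        any(reciprocal_overlap(a, b, min_frac) for b in calls_b)
        for a in calls_a
    )
    matched_b = sum(
        any(reciprocal_overlap(b, a, min_frac) for a in calls_a)
        for b in calls_b
    )
    total = len(calls_a) + len(calls_b)
    return (matched_a + matched_b) / total if total else 0.0


# Made-up calls from two hypothetical tools run on the same sample.
tool_a = [
    CNV("chr1", 1000, 5000, "loss"),
    CNV("chr2", 20000, 30000, "gain"),
    CNV("chr3", 500, 900, "loss"),
]
tool_b = [
    CNV("chr1", 1200, 5200, "loss"),    # overlaps the chr1 call by 95%
    CNV("chr2", 28000, 60000, "gain"),  # overlaps the chr2 call by only 20%
]

print(f"concordance = {concordance(tool_a, tool_b):.2f}")
```

Under these assumptions only the chr1 pair matches, so two of the five calls are concordant. A different matching rule (any overlap, or a stricter threshold) would give a different figure, which is one reason concordance values are hard to compare across studies.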

Change history

  • 29 May 2011

    In the version of this article initially published online, Bhooma Thiruvahindrapuram’s name was misspelled. The error has been corrected for the print, PDF and HTML versions of this article.

Accessions

Gene Expression Omnibus

Acknowledgements

We thank J. Rickaby and M. Lee for excellent technical assistance. We thank colleagues at Affymetrix, Agilent, Illumina, NimbleGen and Biodiscovery for sharing data and software and for technical assistance. The Toronto Centre for Applied Genomics at the Hospital for Sick Children is acknowledged for database, technical assistance and bioinformatics support. This work was supported by funding from the Genome Canada/Ontario Genomics Institute, the Canadian Institutes of Health Research (CIHR), the McLaughlin Centre, the Canadian Institute for Advanced Research, the Hospital for Sick Children (SickKids) Foundation, a Broad SPARC Project award to P.K.D. and C.L., US National Institutes of Health (NIH) grant HD055150 to P.K.D., and the Department of Pathology at Brigham and Women's Hospital in Boston and NIH grants HG005209, HG004221 and CA111560 to C.L. N.P.C., D. Rajan, D. Rigler, T.F., S.G. and E.P. are supported by the Wellcome Trust (grant no. WT077008). D.P. is supported by fellowships from the Canadian Institutes of Health Research (no. 213997) and the Netherlands Organization for Scientific Research (Rubicon 825.06.031). X.S. is supported by a T32 Harvard Medical School training grant, and K.N. is supported by a T32 institutional training grant (HD007396). S.W.S. holds the GlaxoSmithKline-CIHR Pathfinder Chair in Genetics and Genomics at the University of Toronto and the Hospital for Sick Children (Canada). L.F. is supported by the Göran Gustafsson Foundation and the Swedish Foundation for Strategic Research.

Author information

Author notes

    • Ji Hyeon Park

    Present address: Department of Obstetrics and Gynecology, Pochon CHA University College of Medicine, Seoul, South Korea.

    • Dalila Pinto & Katayoon Darvishi

    These authors contributed equally to this work.

Affiliations

  1. The Centre for Applied Genomics, The Hospital for Sick Children, Toronto, Ontario, Canada.

    Dalila Pinto, Anath C Lionel, Bhooma Thiruvahindrapuram, Jeffrey R MacDonald, Aparna Prasad & Stephen W Scherer

  2. Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts, USA.

    Katayoon Darvishi, Xinghua Shi, Ryan Mills, Kristin Noonan, Richard S Smith, Ji Hyeon Park & Charles Lee

  3. Wellcome Trust Sanger Institute, Hinxton, Cambridge, UK.

    Diana Rajan, Diane Rigler, Tom Fitzgerald, Susan Gribble, Elena Prigmore, Matthew E Hurles & Nigel P Carter

  4. Massachusetts General Hospital, Harvard Medical School, Boston, Massachusetts, USA.

    Kristin Noonan & Patricia K Donahoe

  5. McLaughlin Centre and Department of Molecular Genetics, University of Toronto, Toronto, Canada.

    Stephen W Scherer

  6. Department of Immunology, Genetics and Pathology, SciLifeLab Uppsala, Rudbeck Laboratory, Uppsala University, Sweden.

    Lars Feuk

Authors

Dalila Pinto, Katayoon Darvishi, Xinghua Shi, Diana Rajan, Diane Rigler, Tom Fitzgerald, Anath C Lionel, Bhooma Thiruvahindrapuram, Jeffrey R MacDonald, Ryan Mills, Aparna Prasad, Kristin Noonan, Susan Gribble, Elena Prigmore, Patricia K Donahoe, Richard S Smith, Ji Hyeon Park, Matthew E Hurles, Nigel P Carter, Charles Lee, Stephen W Scherer & Lars Feuk

Contributions

D.P., C.L., N.P.C., M.E.H., S.W.S. and L.F. conceived and designed the study. D.P. and L.F. coordinated sample distribution, experiments and analysis. K.D. managed the experiments conceived at the Harvard Medical School and performed the Nexus analysis. R.S.S., D. Rajan, D. Rigler, T.F., J.H.P., K.N., S.G. and E.P. performed the experiments. Data analyses were performed by D.P., K.D., R.S.S., D. Rajan, T.F., A.C.L., B.T., J.R.M., R.M., A.P., K.N., X.S., P.K.D. and L.F. All authors participated in discussions of different parts of the study. D.P., C.L., S.W.S. and L.F. wrote the manuscript. All authors read and approved the manuscript.

Competing interests

The authors declare competing interests. Affymetrix, Agilent, Illumina and NimbleGen provided arrays or reagents for use in this study at a substantial discount. The Centre for Applied Genomics (TCAG) routinely provides fee-for-service experimentation using products from Affymetrix, Agilent and Illumina, and is a Core Lab for Affymetrix and Illumina. S.W.S. belongs to the Scientific Advisory Board of Combimatrix Diagnostics.

Corresponding author

Correspondence to Lars Feuk.

Supplementary information

PDF files

  1. Supplementary Text and Figures

    Supplementary Methods, Supplementary Tables 1, 2, 4–6, and Supplementary Figs. 1–15

Excel files

  1. Supplementary Table 3

    List of all CNVs that passed QC.

About this article

DOI

https://doi.org/10.1038/nbt.1852
