Detection of copy-number variations from NGS data using read depth information: a diagnostic performance evaluation

Abstract

The detection of copy-number variations (CNVs) from NGS data is underexploited, as chip-based or targeted techniques are still commonly used. We assessed the performance of a workflow centered on CANOES, a bioinformatics tool based on read depth information. We applied our workflow to gene panel (GP) and whole-exome sequencing (WES) data, and compared CNV calls to quantitative multiplex PCR of short fluorescent fragments (QMPSF) or array comparative genomic hybridization (aCGH) results. From GP data of 3776 samples, we reached an overall positive predictive value (PPV) of 87.8%. This dataset included a comprehensive QMPSF comparison of four genes (60 exons), on which we obtained 100% sensitivity and specificity. From WES data, we first compared 137 samples with aCGH, restricted the comparison to comparable events (exonic CNVs encompassing enough aCGH probes), and obtained a sensitivity of 87.25%. The overall PPV was 86.4% following the targeted confirmation of candidate CNVs from 1056 additional WES samples. In addition, our CANOES-centered workflow detected CNVs from WES data at single-exon resolution, including CNVs that were missed by aCGH. Overall, switching to an NGS-only approach should be cost-effective, as it reduces overall costs while diagnostic yields are likely to remain stable. Our bioinformatics pipeline is available at: https://gitlab.bioinfo-diag.fr/nc4gpm/canoes-centered-workflow.
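The performance figures quoted in the abstract (sensitivity, specificity, PPV) follow the standard definitions from a two-by-two comparison between the workflow's calls and a reference method such as QMPSF or aCGH. The sketch below shows only those definitions. The class name `ConfusionCounts`, the function names, and the counts in `example` are hypothetical and made up for illustration; they are not data from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Hypothetical container for a call-vs-reference comparison."""
    tp: int  # called by the workflow and confirmed by the reference method
    fp: int  # called by the workflow but not confirmed
    fn: int  # found by the reference method but missed by the workflow
    tn: int  # negative by both methods


def _ratio(numerator: int, denominator: int) -> float:
    # Refuse to report a metric when its denominator is empty.
    if denominator == 0:
        raise ValueError("metric undefined: denominator is zero")
    return numerator / denominator


def sensitivity(c: ConfusionCounts) -> float:
    """Fraction of true events that the workflow detected."""
    return _ratio(c.tp, c.tp + c.fn)


def specificity(c: ConfusionCounts) -> float:
    """Fraction of true negatives that the workflow left uncalled."""
    return _ratio(c.tn, c.tn + c.fp)


def ppv(c: ConfusionCounts) -> float:
    """Fraction of the workflow's calls that were confirmed."""
    return _ratio(c.tp, c.tp + c.fp)


# Made-up counts, for illustration only (not taken from the article).
example = ConfusionCounts(tp=45, fp=5, fn=5, tn=945)
```

Note that PPV needs only the confirmed and unconfirmed calls, which is why it can be estimated from targeted confirmation of candidate CNVs, whereas sensitivity and specificity require a reference result for every compared target.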

Fig. 1: Principles of depth of coverage (DOC) comparison.
Fig. 2: CANOES-centered workflow.
Fig. 3: Example of a CNV detected by aCGH but missed by the CANOES-centered workflow.
Fig. 4: Example of CNVs detected by the CANOES-centered workflow from WES data but missed by aCGH.
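
As a rough illustration of the depth of coverage comparison named in Fig. 1, the sketch below normalizes per-target read counts within each sample and compares a test sample to the median of a set of reference samples. It is a simplified ratio approach under stated assumptions, not the statistical model used by CANOES. The function names, the thresholds `del_max` and `dup_min`, and the read counts are all hypothetical.

```python
from statistics import median


def normalized_depth(counts):
    """Scale per-target read counts by the sample's total read count."""
    total = sum(counts)
    return [c / total for c in counts]


def doc_ratios(test_counts, reference_counts):
    """Ratio of the test sample's normalized depth to the reference median.

    Assumes every target has non-zero coverage in the reference samples.
    """
    test_norm = normalized_depth(test_counts)
    ref_norms = [normalized_depth(r) for r in reference_counts]
    ratios = []
    for i, depth in enumerate(test_norm):
        ref_median = median(r[i] for r in ref_norms)
        ratios.append(depth / ref_median)
    return ratios


def classify(ratio, del_max=0.7, dup_min=1.3):
    """Label a target with arbitrary illustrative thresholds (assumption)."""
    if ratio <= del_max:
        return "deletion"
    if ratio >= dup_min:
        return "duplication"
    return "normal"


# Made-up read counts for 10 targets: three reference samples with even
# coverage, and a test sample with half the reads on target index 3.
references = [[100] * 10, [100] * 10, [100] * 10]
test_sample = [100, 100, 100, 50, 100, 100, 100, 100, 100, 100]
calls = [classify(r) for r in doc_ratios(test_sample, references)]
```

In this toy setting a heterozygous deletion appears as a ratio near 0.5 and a duplication as a ratio near 1.5; real data need a more careful model of coverage variability than fixed thresholds.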

Acknowledgements

This study received funding from the Clinical Research Hospital Program of the French Ministry of Health (GMAJ, PHRC 2008/067), the JPND PERADES, and France Génomique. It was co-supported by the Centre National de Référence Malades Alzheimer Jeunes (CNR-MAJ), the European Union, and Région Normandie. Europe gets involved in Normandie with the European Regional Development Fund (ERDF).

Collaborators

FREX Consortium

Principal Investigators: Emmanuelle Génin (5), Dominique Campion (1,4), Jean-François Dartigues (6), Jean-François Deleuze (3), Jean-Charles Lambert (7), Richard Redon (8)

Bioinformatics group: Thomas Ludwig (5), Benjamin Grenier-Boley (7), Sébastien Letort (5), Pierre Lindenbaum (5), Vincent Meyer (3), Olivier Quenez (1)

Statistical genetics group: Christian Dina (8), Céline Bellenguez (7), Camille Charbonnier (1), Joanna Giemza (8)

Data collection: Stéphanie Chatel (7), Claude Férec (5), Hervé Le Marec (7), Luc Letenneur (6), Gaël Nicolas (1), Karen Rouault (5)

Sequencing: Delphine Bacq (3), Anne Boland (3), Doris Lechner (3)

Author information

Corresponding author

Correspondence to Gaël Nicolas.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Members of the FREX Consortium are listed below Acknowledgements.

Cite this article

Quenez, O., Cassinari, K., Coutant, S. et al. Detection of copy-number variations from NGS data using read depth information: a diagnostic performance evaluation. Eur J Hum Genet (2020). https://doi.org/10.1038/s41431-020-0672-2
