Evaluating the molecular diagnostic yield of joint genotyping–based approach for detecting rare germline pathogenic and putative loss-of-function variants



Cohort-based germline variant characterization is the standard approach for pathogenic variant discovery in clinical and research samples. However, the impact of cohort size on the molecular diagnostic yield of joint genotyping is largely unknown.


We performed a head-to-head comparison of the molecular diagnostic yield of joint genotyping in two cohorts of 239 cancer patients each, first in the absence and then in the presence of 100 additional germline exomes.
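As a rough illustration of this kind of head-to-head comparison, the sketch below partitions variant calls from two analysis runs into those found by both runs and those found by only one. The variant key `(chrom, pos, ref, alt)`, the function name `compare_callsets`, and the toy calls are assumptions made for illustration; they are not the study's pipeline or data.

```python
from typing import Dict, Iterable, Set, Tuple

# A variant is keyed by (chromosome, position, reference allele, alternate allele).
Variant = Tuple[str, int, str, str]


def compare_callsets(run_a: Iterable[Variant],
                     run_b: Iterable[Variant]) -> Dict[str, Set[Variant]]:
    """Split calls into shared and run-specific sets."""
    a, b = set(run_a), set(run_b)
    return {
        "both": a & b,      # called in both runs
        "only_a": a - b,    # called only in the first run
        "only_b": b - a,    # called only in the second run
    }


# Hypothetical toy calls, not taken from the study.
without_extra = [("chr1", 100, "A", "T"), ("chr2", 200, "G", "C")]
with_extra = [("chr1", 100, "A", "T"), ("chr3", 300, "C", "CA")]

result = compare_callsets(without_extra, with_extra)
for label, variants in result.items():
    print(label, sorted(variants))
```

Variants that land in `only_a` or `only_b` correspond to calls whose detection depended on which other samples were genotyped alongside the cohort.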


In 239 testicular cancer patients, 4 (7.4%, 95% confidence interval [CI]: 2.1–17.9) of 54 pathogenic variants in the cancer predisposition and American College of Medical Genetics and Genomics (ACMG) genes were missed by one or both computational runs of joint genotyping. Similarly, 8 (12.1%, 95% CI: 5.4–22.5) of 66 pathogenic variants in these genes were undetected by joint genotyping in another independent cohort of 239 breast cancer patients. An exome-wide analysis of putative loss-of-function (pLOF) variants in the testicular cancer cohort showed that 162 (8.2%, 95% CI: 7.1–9.6) pLOF variants were detected in only one of the two analysis runs, while 433 (22.0%, 95% CI: 20.2–23.9) pLOF variants were filtered out by both analyses despite having sufficient sequencing coverage.
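The abstract does not state how the 95% confidence intervals were computed. As a minimal sketch, assuming an exact (Clopper-Pearson) binomial interval, the proportion and interval for counts such as 4 of 54 or 8 of 66 could be computed as follows. The function names are illustrative, and the interval method is an assumption rather than the authors' stated approach.

```python
from math import comb


def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def solve_increasing(f, lo: float = 0.0, hi: float = 1.0, iters: int = 100) -> float:
    """Bisection root of an increasing f on [lo, hi] with f(lo) < 0 < f(hi)."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(x: int, n: int, alpha: float = 0.05):
    """Exact two-sided (1 - alpha) interval for a binomial proportion x/n."""
    # Lower bound: the p at which P(X >= x) equals alpha / 2.
    lower = 0.0 if x == 0 else solve_increasing(
        lambda p: (1 - binom_cdf(x - 1, n, p)) - alpha / 2)
    # Upper bound: the p at which P(X <= x) equals alpha / 2.
    upper = 1.0 if x == n else solve_increasing(
        lambda p: alpha / 2 - binom_cdf(x, n, p))
    return lower, upper


for missed, total in [(4, 54), (8, 66)]:
    lo, hi = clopper_pearson(missed, total)
    print(f"{missed}/{total} = {missed / total:.1%}, 95% CI: {lo:.1%} to {hi:.1%}")
```

The bounds are found by bisection on the binomial distribution function so that the sketch needs only the standard library; a statistics package would normally be used instead.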


Our analysis of the standard germline variant detection method highlighted a substantial impact of concurrently analyzing additional genomic data sets on the ability to detect clinically relevant germline pathogenic variants.


Fig. 1: Overview of the study design.
Fig. 2: Exome-wide analysis of germline variant discovery in the presence and absence of additional genomics datasets.
Fig. 3: Detection of rare germline pathogenic variants in cancer patients using GATK-JG.
Fig. 4: Detection of rare germline pLOF variants in cancer patients using GATK-JG.
Fig. 5: Performance of GATK-JG in detecting pathogenic and pLOF variants in 12 clinically oriented phenotype-specific multi-gene panels.

Data and software availability

The raw sequence data for all cohorts used in this study can be obtained through dbGaP (https://www.ncbi.nlm.nih.gov/gap) or as described in their original papers (see Methods). All software tools used in this study are publicly available.




We thank all individuals who participated in this study. S.H.A. and E.V.A. had full access to all the data in the study and take responsibility for the integrity of the data and the accuracy of the data analysis. This work was supported by the American Society of Clinical Oncology (ASCO) Conquer Cancer Foundation Career Development Award (CCF CDA) CDA#13167 (S.H.A.), the Prostate Cancer Foundation Young Investigator Award YIA#18YOUN02 (S.H.A.), the PCF-V Foundation Challenge Award (E.M.V.), the National Institutes of Health R37CA222574 (E.M.V.), R01 CA227388 (E.M.V.), and King Abdulaziz City for Science and Technology grant 12-MED2226–46 (M.A.). The funding organizations were not responsible for the design and conduct of the study; collection, management, analysis, and interpretation of the data; preparation, review, or approval of the manuscript; or the decision to submit the manuscript for publication. The results published here are in part based upon data generated by The Cancer Genome Atlas (TCGA), managed by the National Cancer Institute (NCI) and the National Human Genome Research Institute (NHGRI). Information about TCGA can be found at http://cancergenome.nih.gov.

Author information




S.Y.C., E.K., B.R., N.M., E.M.V., A.T.W., and S.H.A. generated the germline variant callsets and performed genomic analysis of sequencing data. S.H.A. performed germline variant pathogenicity assessment. S.H.A., A.M.A., M.A., and A.K.A. performed analysis of clinical characteristics. S.H.A., E.M.V., and S.Y.C. wrote the manuscript. S.H.A. and S.Y.C. prepared the main and supplementary figures. All authors reviewed and edited the manuscript.

Corresponding authors

Correspondence to Amaro Taylor-Weiner or Saud H. AlDubayan.

Ethics declarations

Ethics declaration

All individuals in this study consented to institutional review board–approved protocols that allowed for comprehensive genetic analysis of germline samples (see Methods). This study conforms to the Declaration of Helsinki.

Competing interests

E.V.A. has the following disclosures: advisory and/or consulting for Tango Therapeutics, Genome Medical, Invitae, Illumina, and Ervaxx; research support from Novartis and BMS; equity in Tango Therapeutics, Genome Medical, Syapse, Ervaxx, and Microsoft; travel reimbursement from Roche and Genentech; and institutional patents (ERCC2 mutations and chemotherapy response, chromatin mutations and immunotherapy response, and methods for clinical interpretation). The other authors declare no competing interests.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Camp, S.Y., Kofman, E., Reardon, B. et al. Evaluating the molecular diagnostic yield of joint genotyping–based approach for detecting rare germline pathogenic and putative loss-of-function variants. Genet Med (2021). https://doi.org/10.1038/s41436-020-01074-w
