Acknowledgements
We thank M. Kircher for information about the CADD method and D.B. Goldstein for gene-level metrics insights. We thank Y. Nemirovskaya, E. Anderson, M. Woollett and D. Papandrea for administrative support. Y.I. was supported in part by grant no. UL1 TR000043 from the National Center for Advancing Translational Sciences (NCATS), US National Institutes of Health (NIH) Clinical and Translational Science Award (CTSA) program. This study was supported by the Rockefeller University and the St. Giles Foundation.
Ethics declarations
Competing interests
D.N.C. and P.D.S. are in receipt of funding from Qiagen through a license agreement with Cardiff University. The other authors declare that they have no competing financial interests.
Integrated supplementary information
Supplementary Figure 1 Comparison of the performance of impact prediction methods and mutation signatures, on the basis of functional evidence.
(A) ROC curves comparing the performance of CADD with that of PolyPhen-2 and SIFT in distinguishing between true-positive disease-causing missense mutations extracted from HGMD and false-positive neutral private missense variants derived from patients' WES data from which the known disease-causing mutation was removed. (B) Association of the minor allele frequencies (MAF) of disease-associated deleterious mutations with predicted impact scores. Plot of 129,586 HGMD true-positive deleterious mutations against their minor allele frequencies, in bins of MAF = 0, MAF ≤ 0.001 and MAF ≤ 0.01.
Supplementary Figure 2 Density plots of all 127,109 known HGMD disease-associated deleterious mutations and their corresponding 180,305 alleles by different variant-level software.
(A) CADD scores of all disease-associated deleterious mutations. (B) CADD scores of all disease-associated deleterious alleles. (C) PolyPhen-2 scores of all disease-associated deleterious mutations. (D) PolyPhen-2 scores of all disease-associated deleterious alleles. (E) SIFT scores of all disease-associated deleterious mutations. (F) SIFT scores of all disease-associated deleterious alleles.
Supplementary Figure 3 Density plots of 100,000 bootstrapping simulations to estimate TP and TN prediction rates with CADD-based MSC, by randomly partitioning 1,283 genes that contain at least 10 mutations in both HGMD and the 1000 Genomes Project database.
(A) TP prediction rate of novel disease-associated deleterious mutations by 90% CI MSC. (B) TN prediction rate of novel disease-associated deleterious mutations by 90% CI MSC. (C) TP prediction rate of novel disease-associated deleterious mutations by 95% CI MSC. (D) TN prediction rate of novel disease-associated deleterious mutations by 95% CI MSC. (E) TP prediction rate of novel disease-associated deleterious mutations by 99% CI MSC. (F) TN prediction rate of novel disease-associated deleterious mutations by 99% CI MSC.
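The resampling idea behind these panels can be illustrated with a minimal sketch for a single gene. Everything in it is an assumption made for illustration rather than the authors' procedure: the function names `gene_cutoff` and `bootstrap_rates` are hypothetical, the cutoff is taken as the lower bound of a normal-approximation confidence interval of the mean score, and each iteration splits the gene's disease-associated scores into two random halves, deriving the cutoff from one half and testing it on the other.

```python
import random
import statistics


def gene_cutoff(scores, z=1.96):
    """Hypothetical gene-level cutoff: lower bound of a normal-approximation
    confidence interval of the mean score (z=1.96 corresponds to ~95%)."""
    mean = statistics.fmean(scores)
    if len(scores) < 2:
        return mean
    sem = statistics.stdev(scores) / len(scores) ** 0.5
    return mean - z * sem


def bootstrap_rates(disease, neutral, n_iter=1000, seed=0):
    """Repeatedly split the disease scores in half, derive a cutoff from one
    half, and record the TP rate on the held-out half and the TN rate on the
    neutral scores."""
    rng = random.Random(seed)
    tp_rates, tn_rates = [], []
    for _ in range(n_iter):
        shuffled = disease[:]
        rng.shuffle(shuffled)
        half = len(shuffled) // 2
        train, held_out = shuffled[:half], shuffled[half:]
        cutoff = gene_cutoff(train)
        # Held-out disease scores at or above the cutoff count as true positives.
        tp_rates.append(sum(s >= cutoff for s in held_out) / len(held_out))
        # Neutral scores below the cutoff count as true negatives.
        tn_rates.append(sum(s < cutoff for s in neutral) / len(neutral))
    return tp_rates, tn_rates


# Made-up scores for one hypothetical gene (not data from the study).
disease_scores = [22.0, 25.1, 30.2, 27.5, 24.3, 28.8, 26.0, 31.4, 23.9, 29.7]
neutral_scores = [3.1, 8.4, 12.0, 15.5, 5.2, 9.9, 18.3, 2.7, 11.1, 26.5]
tp, tn = bootstrap_rates(disease_scores, neutral_scores, n_iter=200)
print(statistics.fmean(tp), statistics.fmean(tn))
```

The two returned lists hold one rate per iteration, so plotting their distributions would give density curves of the kind described in the panels, here for one gene and a toy cutoff only.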
Supplementary Figure 4 Characteristics of CADD-based 95% CI MSC scores generated from HGMD disease-associated mutations.
(A) Density plot of the MSC scores of all 19,698 human protein-coding genes, with a genome-wide MSC median of 10.60. (B) An inverse exponential correlation between MSC and gene damage level measured by the gene damage index (GDI), showing that low-MSC genes tend to be highly damaged whereas high-MSC genes tend to be only slightly damaged. (C) An exponential correlation between MSC and purifying selection level measured by the neutrality index (NI), showing that high-MSC genes tend to be under stronger purifying selection. (D) KEGG pathways functional enrichment of 985 genes with low MSC. The upper panel shows enrichment in the complement and coagulation cascades pathway; the lower panel shows enrichment in the ECM-receptor interaction pathway. (E) 2,288 high-MSC genes display a functional enrichment in the Ribosome pathway.
Supplementary Figure 5 ROC curves comparing the performance of MSC with variant-level methods and the RVIS hot zone approach.
(A) CADD-based MSC generated with 90%, 95% and 99% CIs with CADD prediction (provided by the CADD method, based on a fixed cutoff), as well as the RVIS hot zone approach combining RVIS and PolyPhen-2 fixed cutoffs. (B) PolyPhen-2-based MSC generated with 90%, 95% and 99% CIs with PolyPhen-2 prediction (provided by the PolyPhen-2 method, based on a fixed cutoff), as well as the RVIS hot zone approach combining RVIS and PolyPhen-2 fixed cutoffs. (C) SIFT-based MSC generated with 90%, 95% and 99% CIs with SIFT prediction (provided by the SIFT method, based on a fixed cutoff). See Supplementary Methods for a full description of the TP and FP sets used.
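For readers unfamiliar with how an ROC curve is built from a set of impact scores, the following is a generic sketch, not the authors' evaluation code. The helper names `roc_points` and `auc` are hypothetical, and the example scores and labels are made up; labels use 1 for a true-positive (disease-causing) variant and 0 for a neutral one.

```python
def roc_points(scores, labels):
    """Return (false-positive rate, true-positive rate) pairs obtained by
    sweeping a cutoff from the highest score to the lowest."""
    pos = sum(labels)
    neg = len(labels) - pos
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        j = i
        # Variants with tied scores cross the cutoff together.
        while j < len(ranked) and ranked[j][0] == ranked[i][0]:
            if ranked[j][1]:
                tp += 1
            else:
                fp += 1
            j += 1
        points.append((fp / neg, tp / pos))
        i = j
    return points


def auc(points):
    """Area under the curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Made-up example: higher scores are meant to indicate deleterious variants.
scores = [28.1, 24.5, 19.0, 15.2, 9.8, 4.3]
labels = [1, 1, 0, 1, 0, 0]
print(auc(roc_points(scores, labels)))
```

A fixed cutoff corresponds to a single point on such a curve, whereas a gene-specific cutoff changes which variants are called deleterious gene by gene, which is why the two kinds of method are compared by their operating points and overall curves.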
Supplementary Figure 6 True positive and true negative prediction rates of variant-level methods and MSC, estimated by a set of 4,152 recently acquired HGMD disease-associated deleterious alleles of 1,119 missense mutations.
(A) True positive and true negative (private non-disease-causing variants of patients) prediction rates by CADD, PolyPhen-2 and SIFT, using fixed cutoffs, the hot zone approach (combining RVIS and PolyPhen-2), and MSC estimates with 90%, 95% and 99% CIs generated from the HGMD mutation database. (B) True positive (new deleterious mutations that were not used to generate the ClinVar-based MSC scores) and true negative (private non-disease-causing variants of patients) prediction rates by CADD, PolyPhen-2 and SIFT, using fixed cutoffs, the hot zone approach (combining RVIS and PolyPhen-2), and MSC estimates with 90%, 95% and 99% CIs generated from the ClinVar mutation database.
Supplementary information
Supplementary Text and Figures
Supplementary Figures 1–6, Supplementary Note and Supplementary Methods (PDF 1049 kb)
Supplementary Table 1
A summary of the CADD-based 99% CI MSC protein-coding human genes. (XLSX 1409 kb)
Supplementary Table 2
A summary of the CADD-based 95% CI MSC protein-coding human genes. (XLSX 1425 kb)
Supplementary Table 3
A summary of the CADD-based 90% CI MSC protein-coding human genes. (XLSX 1437 kb)
Supplementary Table 4
A summary of the PolyPhen-2-based 99% CI MSC protein-coding human genes. (XLSX 1689 kb)
Supplementary Table 5
A summary of the PolyPhen-2-based 95% CI MSC protein-coding human genes. (XLSX 1700 kb)
Supplementary Table 6
A summary of the PolyPhen-2-based 90% CI MSC protein-coding human genes. (XLSX 1710 kb)
Supplementary Table 7
A summary of the SIFT-based 99% CI MSC protein-coding human genes. (XLSX 1696 kb)
Supplementary Table 8
A summary of the SIFT-based 95% CI MSC protein-coding human genes. (XLSX 1777 kb)
Supplementary Table 9
A summary of the SIFT-based 90% CI MSC protein-coding human genes. (XLSX 1808 kb)
Supplementary Table 10
KEGG pathway categories displaying high levels of enrichment among genes with low and high MSC scores. (XLSX 40 kb)
Supplementary Table 11
True positive and true negative prediction rates, HGMD-based. (XLSX 45 kb)
Supplementary Table 12
True positive and true negative prediction rates, ClinVar-based. (XLSX 45 kb)
Cite this article
Itan, Y., Shang, L., Boisson, B. et al. The mutation significance cutoff: gene-level thresholds for variant predictions. Nat Methods 13, 109–110 (2016). https://doi.org/10.1038/nmeth.3739
DOI: https://doi.org/10.1038/nmeth.3739