Measuring intolerance to mutation in human genetics

Abstract

In numerous applications, from working with animal models to mapping the genetic basis of human disease susceptibility, it is useful to know whether a single disrupting mutation in a gene is likely to be deleterious. With this goal in mind, a number of measures have been developed to identify genes in which protein-truncating variants (PTVs), or other types of mutations, are absent or kept at very low frequency in large population samples, that is, genes that appear ‘intolerant’ to mutation. One measure in particular, the probability of being loss-of-function intolerant (pLI), has been widely adopted. This measure was designed to classify genes into three categories, null, recessive and haploinsufficient, on the basis of the contrast between observed and expected numbers of PTVs. Such population-genetic approaches can be useful in many applications. As we clarify, however, they reflect the strength of selection acting on heterozygotes and not dominance or haploinsufficiency.
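The claim that PTV frequencies reflect selection on heterozygotes can be illustrated with the classical deterministic mutation-selection balance. The sketch below is not the authors' simulation code. It assumes an infinite population with no drift, a single biallelic locus, genotype fitnesses 1, 1 - hs and 1 - s, and a per-generation mutation rate u; the function name and parameter values are hypothetical. Under these assumptions, when h > 0 and hs is much larger than u, the equilibrium frequency is approximately u/(hs), so different combinations of h and s with the same product give nearly the same expected PTV frequency.

```python
def equilibrium_frequency(h, s, u, generations=10_000):
    """Iterate the deterministic selection-then-mutation recursion.

    Assumed fitnesses: wild-type homozygote 1, heterozygote 1 - h*s,
    mutant homozygote 1 - s. u is the mutation rate towards the
    deleterious allele; back mutation is ignored.
    """
    q = 0.0  # start with no deleterious alleles
    for _ in range(generations):
        p = 1.0 - q
        mean_fitness = p * p + 2 * p * q * (1 - h * s) + q * q * (1 - s)
        # Allele frequency after viability selection.
        q_sel = (p * q * (1 - h * s) + q * q * (1 - s)) / mean_fitness
        # New mutations convert a fraction u of wild-type alleles.
        q = q_sel + u * (1 - q_sel)
    return q


if __name__ == "__main__":
    u = 1e-6  # illustrative value, not taken from the article
    # Three (h, s) pairs sharing the same product hs = 0.05.
    for h, s in [(1.0, 0.05), (0.5, 0.1), (0.1, 0.5)]:
        q = equilibrium_frequency(h, s, u)
        print(f"h={h:<4} s={s:<5} hs={h * s:.3f} "
              f"q_eq={q:.3e} u/(hs)={u / (h * s):.3e}")
```

The approximation breaks down for fully recessive mutations (h = 0), where the deterministic equilibrium is instead close to the square root of u/s, and in finite populations drift adds variance that this sketch ignores.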

Fig. 1: pLI relates to hs, but not h and s separately.
Fig. 2: Properties of pLI.

Data availability

C++ source code for the simulations of PTV counts and accompanying scripts used for plotting and data analysis are available at https://github.com/zfuller5280/MutationIntoleranceSimulations.

Acknowledgements

We thank A. Chakravarti, G. Coop, M. B. Eisen, M. Hurles, J. K. Pritchard, Y. Shen and members of the laboratories of M. Przeworski and G. Sella for helpful discussions. This work was supported by GM128318 to Z.L.F., GM126787 to J.J.B., GM121372 to M.P. and GM115889 to G.S. We acknowledge computing resources from Columbia University's Shared Research Computing Facility project, which is supported by NIH Research Facility Improvement Grant 1G20RR030893-01, and associated funds from the New York State Empire State Development, Division of Science Technology and Innovation (NYSTAR) contract C090171.

Author information

All authors conceived and designed the project. M.P. and G.S. supervised the study. Z.L.F. performed simulations. H.M., J.J.B. and Z.L.F. led the data analysis. All authors wrote the manuscript and approved the final version.

Correspondence to Zachary L. Fuller.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
