Selective effects of heterozygous protein-truncating variants

Charlesworth, Brian; Hill, William G.

doi:10.1038/s41588-018-0291-9

Correspondence
Published: 26 November 2018

Nature Genetics volume 51, page 2 (2019)

A Publisher Correction to this article was published on 12 December 2018

This article has been updated

To the Editor — Cassa et al.¹ have recently presented an interesting analysis of the selection coefficients against heterozygous carriers of protein-truncating variants (PTVs) in several human populations, concluding that the mean selection coefficient against such a mutation when heterozygous is approximately 0.05, with a wide distribution around the mean (p. 809 and Fig. 1 in ref. ¹). With random mating in a large population, selection against the heterozygous carriers of strongly deleterious mutations is the predominant selective force, because homozygotes are very rare. The equilibrium frequency of mutant alleles at a locus, q*, is then equal to u/s_het, where u is the mutation rate of a deleterious allele, and s_het is the decrease in fitness experienced by heterozygous carriers, measured relative to the fitness of normal individuals².
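The equilibrium result quoted above can be recovered from a short balance argument. This is a sketch that assumes q is small enough for homozygotes and terms of order q² to be neglected, so that mutation adds mutant alleles at rate u per generation and selection removes them at rate s_het q:

$$\Delta q \approx u - s_{\mathrm{het}}\,q, \qquad \Delta q = 0 \;\Rightarrow\; q^{*} = \frac{u}{s_{\mathrm{het}}}$$

As an illustration only, taking u = 10⁻⁶ per gene per generation (an assumed value, not one given in the text) and s_het = 0.05 gives q* = 2 × 10⁻⁵.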

For this purpose, it is reasonable to assume that, for a given gene, u is the net rate of mutation to all possible PTVs that can be generated for the gene in question and that s_het is the same for all the PTV mutations in the gene. If the mutations are sufficiently severe in their fitness effects that they are destined for rapid elimination from the population, the mean frequency of mutant alleles over the probability distribution of q generated by random genetic drift is approximately equal to q* (ref. ³), thus apparently justifying the assumption of mutation-selection equilibrium. In their analysis, Cassa et al. assumed that the observed number of copies of a mutant allele for a given gene in a set of N alleles sampled from a population is drawn from a Poisson distribution with mean Nq*. For this assumption to be valid, the fluctuations in q around q* produced by drift must be negligible. Cassa et al. justified this assumption through a heuristic argument (first section in Methods in ref. ¹).

We believe that this assumption is questionable, as can be seen by considering the probability density of q, ϕ(q), at the stationary state among mutation, selection and drift in a randomly mating population with effective size N_e, first studied by Wright⁴ (formally, the existence of the stationary state requires a small amount of back mutation from mutant to wild type, but this has a trivial effect and can be ignored). Nei³ has shown that ϕ(q) for a strongly selected mutation with a heterozygous selection coefficient s_het is well approximated by a gamma distribution, with a mean of q* and shape parameter θ = 4N_eu. Poisson sampling from a gamma distribution generates a negative binomial distribution⁵ for the number of copies i of a mutant allele in a sample of N alleles:

$$P(i) = \binom{i + \theta - 1}{i}\left(\frac{z}{z + 1}\right)^{\theta}\left(\frac{1}{z + 1}\right)^{i}$$

where z = 4N_es_het/N.
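The sampling distribution above can be checked numerically. The following is a minimal Python sketch, not the authors' code: the function name `neg_binom_pmf` and the per-gene mutation rate u = 10⁻⁶ are assumptions chosen for illustration, while N, s_het and N_e are values of the kind discussed in this letter. The binomial coefficient with non-integer θ is evaluated through log-gamma functions.

```python
import math

def neg_binom_pmf(i, theta, z):
    """P(i) for the gamma-Poisson (negative binomial) sampling distribution."""
    # Generalized binomial coefficient: Gamma(i + theta) / (Gamma(theta) * i!)
    log_p = (math.lgamma(i + theta) - math.lgamma(theta) - math.lgamma(i + 1)
             + theta * math.log(z / (z + 1.0))
             - i * math.log(z + 1.0))
    return math.exp(log_p)

# Illustrative values; u is an assumed per-gene PTV mutation rate.
N, s_het, N_e, u = 60_000, 0.05, 10_000, 1e-6
theta = 4 * N_e * u          # shape parameter, 4 * N_e * u
z = 4 * N_e * s_het / N      # z = 4 * N_e * s_het / N

# Truncate the infinite sum where the remaining tail is negligible.
probs = [neg_binom_pmf(i, theta, z) for i in range(5000)]
total = sum(probs)
mean = sum(i * p for i, p in enumerate(probs))
var = sum((i - mean) ** 2 * p for i, p in enumerate(probs))

print(f"sum of P(i) = {total:.6f}")
print(f"mean = {mean:.4f}   (N*u/s_het = {N * u / s_het:.4f})")
print(f"variance = {var:.3f}   (a Poisson with this mean has variance {mean:.4f})")
```

Under these assumed inputs the numerical mean should agree with Nq* = Nu/s_het, while the variance is far larger than the Poisson value, which is the overdispersion at issue.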

The mean and variance of the distribution are Nq* and θ(1 + z)/z², respectively. The ratio of the coefficient of variation of this distribution to that for a Poisson distribution with the same mean is √(1 + 1/z). It follows that, if z ≪ 1, there is a much wider spread in the sampling distribution of the observed numbers of copies of mutant alleles across different genes than was assumed by Cassa et al. For example, with N = 60,000, s_het = 0.05, and N_e = 10,000 (a frequently used estimate for the species effective population size of humans⁶), the ratio is equal to 5.57.
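The ratio follows in a few steps from the moments just stated. Writing the mean as μ = Nq* = θ/z, and using the fact that a Poisson distribution has variance equal to its mean:

$$\mathrm{CV}_{\mathrm{NB}} = \frac{\sqrt{\theta(1+z)}/z}{\theta/z} = \sqrt{\frac{1+z}{\theta}}, \qquad \mathrm{CV}_{\mathrm{Pois}} = \frac{1}{\sqrt{\mu}} = \sqrt{\frac{z}{\theta}}, \qquad \frac{\mathrm{CV}_{\mathrm{NB}}}{\mathrm{CV}_{\mathrm{Pois}}} = \sqrt{\frac{1+z}{z}} = \sqrt{1 + \frac{1}{z}}$$

For the numerical example, z = (4 × 10,000 × 0.05)/60,000 = 1/30, so the ratio is √31 ≈ 5.57. Note that the ratio does not depend on θ, and hence not on the mutation rate u.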

This result implies that there may be a substantial upward bias in the spread of the distribution of s_het values estimated by the method of Cassa et al. We recognize that it is probably not appropriate to use the above value of N_e = 10,000–20,000 for humans, which is obtained from putatively neutral DNA sequence diversity and reflects the harmonic mean of the species effective population size over several hundred thousand years in the past⁶. Mutations destined for loss persist in a population for only a few tens of generations at most⁷, and so the N_e relevant for PTVs is likely to reflect the much larger population sizes characteristic of the last few hundred years, thus decreasing the size of the bias. For example, with N_e = 100,000, the ratio of the coefficients of variation becomes 2.
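The dependence of the bias on N_e can be tabulated directly from the ratio √(1 + 1/z) with z = 4N_es_het/N. The Python sketch below is illustrative only; the function name `cv_ratio` is hypothetical, and the intermediate value N_e = 20,000 is included simply because it lies in the range mentioned above.

```python
import math

def cv_ratio(n_sample, s_het, n_e):
    """Ratio of the negative binomial CV to the Poisson CV with the same mean."""
    z = 4.0 * n_e * s_het / n_sample
    return math.sqrt(1.0 + 1.0 / z)

N, S_HET = 60_000, 0.05  # sample size (alleles) and heterozygous selection coefficient
for n_e in (10_000, 20_000, 100_000):
    print(f"N_e = {n_e:>7,d}  ratio = {cv_ratio(N, S_HET, n_e):.2f}")
```

With these inputs the first and last rows should reproduce the values of 5.57 and 2 quoted in the text, showing how quickly the overdispersion shrinks as the relevant N_e grows.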

In addition, the elimination of strongly deleterious alleles is also affected by population subdivision, and their fate is then strongly determined by the local effective population size, as shown by the classic studies of Dobzhansky and Wright⁸ on the allelism of lethal mutations in populations of Drosophila pseudoobscura. Even for the simple case of an island model with an infinite number of demes, the expression for ϕ(q) becomes more complex than a gamma distribution, and the mean allele frequency can depart substantially from q* (ref. ⁹). A detailed analysis of the effects of drift on the frequency distribution of the numbers of deleterious mutations with the demographies characteristic of the populations used in their study would be needed to determine whether the conclusions reached by Cassa et al. concerning the width of the distribution of the heterozygous selection coefficient are valid. In addition, their estimates of the selection coefficients for individual genes were based on the inferred distribution of s_het, thus also prompting questions about their accuracy.

In response to these comments, the authors¹⁰ have conducted an analysis that includes a model of recent population-size change for Europeans. The results appear to substantiate their previous conclusions, notwithstanding the approximations made in their original study.

Reporting summary

Further information on experimental design is available in the Nature Research Reporting Summary linked to this paper.

Change history

12 December 2018
In the version of this article initially published, reference 10 incorrectly cited Seplyarskiy, V. B. et al. Weghorn, D. et al. is the correct reference. The error has been corrected in the HTML and PDF version of the article.

References

1. Cassa, C. A. et al. Nat. Genet. 49, 806–810 (2017).
2. Haldane, J. B. S. Proc. Camb. Philos. Soc. 23, 838–844 (1927).
3. Nei, M. Proc. Natl. Acad. Sci. USA 60, 517–524 (1968).
4. Wright, S. Genetics 16, 97–159 (1931).
5. Fisher, R. A. Ann. Eugen. 11, 182–187 (1941).
6. Charlesworth, B. Nat. Rev. Genet. 10, 195–205 (2009).
7. Kimura, M. & Ota, T. Genetics 63, 701–709 (1969).
8. Wright, S., Dobzhansky, T. & Hovanitz, W. Genetics 27, 363–394 (1942).
9. Glémin, S. Genet. Res. 86, 41–51 (2005).
10. Weghorn, D. et al. Preprint at https://www.biorxiv.org/content/early/2018/10/03/433961 (2018).

Author information

Authors and Affiliations

Institute of Evolutionary Biology, School of Biological Sciences, University of Edinburgh, Edinburgh, UK
Brian Charlesworth & William G. Hill

Corresponding authors

Correspondence to Brian Charlesworth or William G. Hill.

Ethics declarations

Competing interests

The authors declare no competing interests.

About this article

Cite this article

Charlesworth, B., Hill, W.G. Selective effects of heterozygous protein-truncating variants. Nat Genet 51, 2 (2019). https://doi.org/10.1038/s41588-018-0291-9

Issue Date: January 2019