In a Review in this journal (Alternative cleavage and polyadenylation in health and disease. Nat. Rev. Genet. 20, 599–614 (2019))1, Gruber and Zavolan summarized recent research on alternative polyadenylation (APA). Under the presumption that APA is beneficial, the authors presented cases in which APA produces mRNAs with different functions or modes of regulation from the same gene and discussed the APA defects that can cause disease. However, genome-scale evidence suggests that APA arises largely from molecular error and is generally non-adaptive2, which runs counter to the notion that APA has a positive functional impact.

Support for the notion that APA enhances useful function or regulation is only anecdotal1,3,4,5. Genomic studies have failed to detect a clear relationship between APA and mRNA stability or translatability6, or between APA and protein concentration7. These and other observations prompted the proposal that most genes have only one optimal polyadenylation site, which is typically the most frequently used site (hereafter referred to as the major site), and that APA occurs largely owing to polyadenylation error2. This error hypothesis predicts a number of global patterns of APA that have been verified in multiple tissues from five mammals2. Below, we estimate the fraction of APA in humans that is deleterious, using an established method8,9,10.

For a given polyadenylation error rate, more highly expressed genes produce more molecules of non-optimal mRNA. This difference creates stronger selection against polyadenylation error in more highly expressed genes, causing the extent of APA to decrease with gene expression level2. Under the assumption that polyadenylation error has not been selectively removed at all in the least expressed genes but has been completely removed in the most expressed genes, a conservative estimate of the fraction of deleterious polyadenylation is 1 minus the ratio of the extent of APA in the 20 most expressed genes to that in the 20 least expressed genes. This fraction turns out to be ~93% (based on figure 1A in ref.2), potentially explaining why beneficial APA is not observed at the genomic scale. Note that a genome-wide difference in the length of 3′ untranslated regions among tissues or developmental stages is not evidence for adaptation because such a difference could result from expression changes of a small number of trans-factors1.
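The arithmetic behind this estimate can be sketched in a few lines of Python. This is only an illustration of the ratio described above, not the analysis pipeline of ref. 2: the function name and the input values are hypothetical, and the letter does not specify here which measure of the extent of APA is used, so the sketch accepts any non-negative scalar measure.

```python
def deleterious_apa_fraction(apa_extent_high_expr: float,
                             apa_extent_low_expr: float) -> float:
    """Estimate the fraction of deleterious APA as 1 - (high / low).

    apa_extent_high_expr: extent of APA in the most expressed genes,
        assumed to retain no deleterious polyadenylation error.
    apa_extent_low_expr: extent of APA in the least expressed genes,
        assumed to have had no error removed by selection.
    Both arguments are hypothetical scalar summaries of APA extent.
    """
    if apa_extent_low_expr <= 0:
        raise ValueError("APA extent in the least expressed genes must be positive")
    if apa_extent_high_expr < 0:
        raise ValueError("APA extent cannot be negative")
    return 1.0 - apa_extent_high_expr / apa_extent_low_expr


# Made-up illustrative inputs (NOT values read from ref. 2), chosen so that
# the ratio corresponds to the ~93% figure quoted in the text.
fraction = deleterious_apa_fraction(apa_extent_high_expr=0.07,
                                    apa_extent_low_expr=1.0)
print(f"Estimated deleterious fraction: {fraction:.0%}")
```

If the two extreme assumptions are relaxed (some error already purged from the least expressed genes, or some remaining in the most expressed genes), the true fraction would be higher than this ratio suggests, which is why the estimate is described as conservative.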

The mechanisms by which alterations of APA affect health and disease differ under the adaptive and error hypotheses. A disruption of polyadenylation at the major site is expected to be harmful under both hypotheses. By contrast, increasing polyadenylation at an existing non-major site or creating a new polyadenylation site is not expected to be pathogenic under the adaptive hypothesis, but will probably be so under the error hypothesis. Thus, the finding that gaining a proximal polyadenylation site in IRF5 causes systemic lupus erythematosus11 is more readily explained by the error hypothesis than by the adaptive hypothesis.

In summary, the error hypothesis provides a perspective on the origin, biological significance and disease relevance of APA that differs from the adaptive view held by Gruber and Zavolan1. Considering both perspectives will likely be more productive in APA research than holding only one view, especially when this view fails the test of genomics.

There is a reply to this letter by Gruber, A. J. & Zavolan, M. Nat. Rev. Genet. https://doi.org/10.1038/s41576-019-0199-y (2019).