In the cell, genomic DNA is transcribed into various types of RNA. But not all RNAs are translated into proteins. Does this give protein-coding RNAs greater credibility in terms of function? Views differ.
The topic in brief
Among RNA populations, messenger RNA arguably holds special status.
Its mere transcription from DNA is considered sufficient evidence for its function as a protein-coding sequence.
Whether tens of thousands of non-protein-coding RNAs (ncRNAs) are equally important is debatable.
One argument is that unless a function is discovered for a ncRNA, transcription per se is not enough to suggest that it has a function.
The alternative viewpoint is that if ncRNAs are transcribed, it must be for a reason.
Quantity or quality?
Monika S. Kowalczyk & Douglas R. Higgs
In many species, ncRNAs are abundant and bewilderingly complex. What we wonder is whether they all carry genetic information (as do all mRNAs), or whether some of them are the by-products of abnormal or inconsequential transcription.
Many short ncRNAs, which are often derived from long ncRNAs (lncRNAs), regulate gene expression (Table 1). Moreover, full-length lncRNAs may themselves have biological roles. Take, for example, the RNA product of the XIST gene, an lncRNA that effects inactivation of the X chromosome. Many ncRNAs are transcribed from regions around genes and their regulatory elements (for example, enhancers and promoters). Some overlap with protein-coding genes in both sense (the direction of transcription of the gene) and antisense orientations. Others lie in intergenic regions far from protein-coding genes.
In contrast to protein-coding genes, however, genomic sequences encoding ncRNAs have often been poorly conserved during evolution. Also, very few natural mutations in ncRNAs have been shown to be the main cause of genetic diseases in humans, and few functionally important mutations in ncRNA-encoding genes have been identified in animal models1. This suggests that, in contrast to many protein-coding genes, individual ncRNAs have a relatively minor role in biological processes.
If some ncRNAs are non-functional, why are they transcribed? Here, it may be that the level of expression is the significant factor. The ever-improving technologies for sequencing the transcriptome2 (the cell's complement of total RNA) can now detect RNAs present at an average of less than one copy per cell. At what stage does such information pass from revealing additional genetic complexity to simply detecting the inevitable by-products of transcription from accessible, activated chromatin (DNA–protein complexes)?
Indeed, the efficiency with which all stages of transcription and RNA processing are performed is intimately related to the physical and chemical state of the associated chromatin. The cell has evolved complex systems to suppress promiscuous transcription from chromatin both within genes and from intergenic regions. Furthermore, specific pathways degrade aberrant or irrelevant transcripts3. When these constraints are removed experimentally (and presumably when naturally modified in vivo), some irrelevant RNA transcripts are likely to be produced from many promoter- and enhancer-like elements that are accessible in chromatin.
The genome contains many more enhancers than it does protein-coding genes, and these determine when and where genes are expressed. Enhancers are widely dispersed throughout the genome. Those that occur between gene sequences produce a variety of RNAs, including lncRNAs, but very few of these transcripts have been shown unequivocally to have a function (Table 1). Enhancers located within genes also produce long transcripts known as multi-exonic enhancer RNAs (meRNAs; Table 1), which resemble mRNA. Nonetheless, these transcripts have a very low coding capacity4, and their role — if any — is unknown.
Although many RNAs emanating from enhancers, promoters and other genomic elements that regulate gene expression may represent inconsequential transcription, transcription per se may be required for establishing or maintaining the activity of these elements and for 'templating' the associated chromatin. If so, information carried in the sequences of these ncRNAs would be largely irrelevant.
The huge number and complexity of RNAs being documented is certainly of great interest, and it would be surprising if evolution had not selected a proportion of these for their biological function5. However, the onus is on scientists to unequivocally demonstrate the biological roles of these molecules, rather than presuming that they are all functionally relevant components of the transcriptome. For starters, ncRNAs should be accurately classified by fully defining their associated transcriptional units and their patterns of expression during development and cell differentiation, as recently set out6. This, in turn, should direct the challenging experiments required to determine how various ncRNAs act, individually or in groups, to exert their proposed biological effects.
Patience is a virtue
Thomas R. Gingeras
An open mind is not an uncritical one, and the obligation to be critical as scientists should not necessarily condition us to look unfavourably on unexpected results. We should therefore not be too sceptical about ncRNAs just because we don't know their functions. But let me start with an outline of where the debate originates.
Within the past decade, reports that the mouse and human genomes were pervasively transcribed (meaning “that the majority of its bases are associated with at least one primary transcript”7) into predominantly ncRNAs8,9 were surprising and resulted in two types of criticism. The first of these centred on whether the detected RNAs are artefacts of the technologies used to identify them. The second objection focused on the biological importance of such transcripts. Unfortunately, these criticisms were often intermingled10, requiring subsequent correction11.
The overwhelming majority of novel ncRNAs have three properties that, to sceptics, suggest they can be ignored. The transcripts seem to have greatly reduced protein-coding potential; their expression levels are markedly lower than those of mRNAs; and their expression is mostly cell-type specific. Moreover, genomic sequences encoding these transcripts map to regions that were previously thought to be either untranscribed (sequences in the opposite strands to genes, and sequences between genes) or uninformative (intron sequences within genes, which do not make it into mature mRNA).
The artefact objection has now been addressed by results from many labs showing a wealth of ncRNA expression using a wide range of technologies (tiling arrays, high-throughput RNA sequencing, full-length complementary DNA cloning, northern hybridization and RNase protection). Determining the biological importance of ncRNAs is more challenging and an area of active investigation. Nonetheless, as the efforts to catalogue and characterize such RNAs get under way, the initial atmosphere of scepticism continues to hang over this subject.
Healthy scepticism is an essential element of the scientific process. But it seems curious that ncRNAs have been deemed less interesting than mRNAs simply because they were discovered recently and their biological roles are poorly understood. It is perhaps worth recalling that it took almost eight years from the discovery of the first microRNA (lin-4) to the elucidation of the function of the very large class of short ncRNAs to which it belongs12. The functions attributed to miRNAs include such fundamental biological processes as control of developmental timing (miR273), organ development (miR84), tissue growth (miR181) and tumorigenesis (miR17).
According to the careful annotation by the GENCODE group13, there are currently some 161,000 human transcripts, 85,323 (53%) of which are ncRNAs14. Although the biological function of most of these ncRNAs is unclear, roughly 2% are precursors to miRNAs. In addition, 10% are lncRNAs that map to intergenic and intronic regions, and many of these transcripts have been implicated in regulation — both locally and from a distance — of developmentally important genes6. Notably, another 16% of the annotated ncRNAs map to pseudogenes — genes that have lost their original functional abilities. And some of these have been shown to regulate gene expression by acting as decoys for miRNAs15.
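The proportions quoted above can be checked with a few lines of arithmetic. The sketch below is illustrative only: it takes the counts and rounded percentages as stated in the text, treats the approximate total ("some 161,000") as exact, and assumes the three classes do not overlap. The function and variable names are hypothetical, and the implied counts are rough estimates rather than GENCODE figures.

```python
# Figures quoted in the text; the total is approximate ("some 161,000").
TOTAL_TRANSCRIPTS = 161_000
NCRNA_TRANSCRIPTS = 85_323

# Rounded shares of the annotated ncRNAs, as quoted in the text.
# Assumption: the three classes are treated as non-overlapping.
NCRNA_CLASS_SHARES = {
    "miRNA precursors": 0.02,
    "intergenic and intronic lncRNAs": 0.10,
    "pseudogene-derived transcripts": 0.16,
}


def ncrna_fraction(ncrna=NCRNA_TRANSCRIPTS, total=TOTAL_TRANSCRIPTS):
    """Fraction of all annotated transcripts that are ncRNAs."""
    return ncrna / total


def implied_counts(shares=NCRNA_CLASS_SHARES, ncrna=NCRNA_TRANSCRIPTS):
    """Approximate transcript counts implied by the rounded shares."""
    return {name: round(share * ncrna) for name, share in shares.items()}


def unassigned_share(shares=NCRNA_CLASS_SHARES):
    """Share of ncRNAs that falls outside the three classes named in the text."""
    return 1.0 - sum(shares.values())


if __name__ == "__main__":
    print(f"ncRNA share of transcripts: {ncrna_fraction():.0%}")
    for name, count in implied_counts().items():
        print(f"{name}: about {count:,}")
    print(f"ncRNAs outside these three classes: about {unassigned_share():.0%}")
```

On these assumptions, the three named classes account for a little over a quarter of the annotated ncRNAs. The remainder is simply not assigned to a class in the text, which is a different thing from being known to lack a function.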
With the growing identification of functional classes of ncRNA and understanding of the various roles that many of these transcripts have, the original atmosphere of pessimism concerning their biological importance should gradually change to one of cautious interest. The scientific process is not free of bias. But openness to fresh possibilities has the potential to reveal many new ideas.
Mattick, J. S. PLoS Genet. 5, e1000459 (2009).
Mercer, T. R. et al. Nature Biotechnol. 30, 99–104 (2011).
Schmid, M. & Jensen, T. H. Wiley Interdisc. Rev. RNA 1, 474–485 (2010).
Kowalczyk, M. S. et al. Mol. Cell http://dx.doi.org/10.1016/j.molcel.2011.12.021 (2012).
van Bakel, H. & Hughes, T. R. Briefings Funct. Genomics Proteomics 8, 424–436 (2009).
Ulitsky, I., Shkumatava, A., Jan, C. H., Sive, H. & Bartel, D. P. Cell 147, 1537–1550 (2011).
The ENCODE Project Consortium Nature 447, 799–816 (2007).
Okazaki, Y. et al. Nature 420, 563–573 (2002).
Kapranov, P. et al. Science 296, 916–919 (2002).
van Bakel, H., Nislow, C., Blencowe, B. J. & Hughes, T. R. PLoS Biol. 8, e1000371 (2010).
Clark, M. B. et al. PLoS Biol. 9, e1000625 (2011).
Lee, R. C., Feinbaum, R. L. & Ambros, V. Cell 75, 843–854 (1993).
Harrow, J. et al. Genome Biol. 7, Suppl. 1, S4 (2006).
Ebert, M. S. & Sharp, P. A. Curr. Biol. 20, R858–861 (2010).
Kim, T.-K. et al. Nature 465, 182–187 (2010).
See Insight p.321
Kowalczyk, M., Higgs, D. & Gingeras, T. RNA discrimination. Nature 482, 310–311 (2012). https://doi.org/10.1038/482310a