The discovery of peptides encoded by what were thought to be non-coding – or 'junk' – regions of precursors to microRNA sequences reveals a new layer of gene regulation. These sequences may not be junk, after all. See Letter p.90
In plants and animals, microRNAs regulate the expression of many different genes1. Such regulation is crucial in a variety of processes, including transitions through developmental stages and responses to environmental stresses. MicroRNAs (miRNAs) are short in sequence and are generated by enzymatic excision from precursor transcripts called primary miRNAs (pri-miRs), which until now had been assumed not to encode any proteins. But on page 90 of this issue, Lauressergues et al.2 provide convincing evidence to the contrary. They find that some pri-miRs encode peptides that enhance production of their miRNAs. This is the first report of a functional peptide being encoded by a pri-miR and provides a fresh perspective on the significance of pri-miR regions beyond those that directly give rise to miRNAs.
In the 1970s, as it started to become clear that the genomic regions that encode proteins (the genes) swim in a sea of non-protein-coding sequences, the idea of meaningless, or 'junk', DNA became a hot topic of discussion. Biologists are now well aware of introns, the sequences within genes that separate the coding regions (exons) and which are spliced out at the messenger-RNA level, as well as their notable regulatory roles. However, the term junk DNA has survived and is used loosely to describe genomic sequences between genes, giving them an implied lack of importance.
The debate about the usefulness of non-protein-coding DNA sequences continues to rage3,4. However, within these intergenic regions of a genome are the sequences that produce most plant and many animal pri-miRs. Clearly, these sequences are not useless. Yet the regions of a pri-miR that generate neither the miRNA nor the highly structured adjacent sequences have suffered a similar fate: they have been largely ignored and possibly thought of as junk RNA lacking function.
Both plant and animal pri-miRs are transcribed from DNA in the nucleus by the enzyme RNA polymerase II (Fig. 1). The structured (fold-back) region of the transcript surrounding the miRNA sequence is recognized and processed by one of two enzymes — Drosha or Dicer-like1. (In animals, Drosha extracts a short hairpin-like RNA known as pre-miR, which contains the miRNA sequence5. In plants, Dicer-like1 cuts out the miRNA in a duplex form6.) Next, transporter proteins export the excised sequences to the cytoplasm, where they are further processed before becoming competent to guide the RNA-induced silencing complex (RISC) in repressing target genes through either cleavage or translational repression of their mRNAs.
It is generally thought that the sequences of a pri-miR upstream and downstream of the foldback region are rapidly degraded after excision of the embedded miRNA. However, the initial pri-miR has the same characteristics as any mRNA produced by RNA polymerase II (specifically, alteration of its 5′ end by a modification called capping, and addition of polyadenyl groups to its 3′ end). It is therefore equipped with the signals for nuclear export, and for stability and translation in the cytoplasm. Nonetheless, the fate of any full-length pri-miR that escapes processing into an miRNA — in maize (corn), for example, such pri-miR sequences can range from 250 to 2,500 nucleotides long7 — and its capacity to encode a peptide have been largely unnoticed or ignored.
Lauressergues and colleagues identified short open reading frames (ORFs; sequences that can potentially encode proteins) in many different pri-miRs of two plant species. For five of these ORFs, they predicted the amino-acid sequences, synthesized the corresponding peptides and raised specific antibodies against them. Using these antibodies, the authors showed that the ORFs are naturally translated in plants into peptides that they call miPEPs.
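To make the idea of a short ORF concrete, the following is a minimal sketch of how one might scan a transcript sequence for candidates. It is not the authors' method. It assumes a conventional ATG start codon, the three standard stop codons and the forward strand only; the function name `find_short_orfs`, its length limits and the toy sequence are hypothetical and chosen purely for illustration.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_short_orfs(seq, min_codons=3, max_codons=100):
    """Return (start, end, orf) tuples for ATG-initiated ORFs on the forward strand.

    start and end are 0-based with end exclusive; orf includes the stop codon.
    min_codons and max_codons count coding codons (ATG included, stop excluded).
    Assumption: only ATG starts are considered, which will miss ORFs that
    begin at unusual start codons.
    """
    seq = seq.upper().replace("U", "T")  # accept RNA or DNA input
    orfs = []
    for start in range(len(seq) - 2):
        if seq[start:start + 3] != "ATG":
            continue
        # Walk codon by codon, in frame with the ATG, until the first stop.
        for pos in range(start + 3, len(seq) - 2, 3):
            if seq[pos:pos + 3] in STOP_CODONS:
                n_codons = (pos - start) // 3
                if min_codons <= n_codons <= max_codons:
                    orfs.append((start, pos + 3, seq[start:pos + 3]))
                break  # an ORF ends at its first in-frame stop
    return orfs


# Toy sequence (hypothetical): one ATG followed by two codons and a stop.
print(find_short_orfs("GGATGAAACCCTAGTT"))
```

A scan of this kind only nominates candidates. Whether any of them is translated still has to be shown experimentally, as the authors did with antibodies.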
In the cases examined, the miPEPs had the same tissue distribution as their associated mature miRNAs and enhanced the expression and effectiveness of these miRNAs. Moreover, the miPEPs promoted the transcription of their corresponding pri-miR, rather than enhancing miRNA stability. This discovery reveals an unexpected function for at least part of the non-foldback pri-miR sequences and highlights yet another layer of gene regulation. It also raises questions about the existence and functions of other peptides potentially encoded by such short ORFs.
Genomic sequences with the potential to encode pri-miRs are constantly evolving in plants. They seem to arise from inverted duplications of whole or fragmented genes that lead to the production of hairpin-like RNAs8. If such RNAs produce useful miRNAs for gene regulation, they are refined into pri-miRs; if not, they erode away. This has led to the concept of ancient and recent miRNAs. Ancient miRNAs have sequences and functions that are conserved across many species, have survived for hundreds of millions of years, and seem destined to be essential for future plant evolution. Recent miRNAs are more species-specific and have much less assured functions and futures.
The miPEPs discovered in the present paper are associated with several families of miRNAs. If we put miR165 into the miR166 family (the two miRNAs differ by only one nucleotide), all seven of these miPEPs are associated with ancient miRNA families that are conserved across all flowering plants. Thus, they have all had the evolutionary time to create ORFs encoding functionally useful peptides. From this, it seems likely that yet-to-be-discovered miPEPs will be more prevalent in ancient miRNA families and that miPEPs in younger miRNA families may be detectably co-evolving with their associated miRNAs. It also seems possible that miPEPs are encoded in some animal pri-miRs.
The identification of further miPEPs, using bioinformatics alone, might not be easy. Five of the seven miPEPs identified by Lauressergues et al. are encoded in ORFs of fewer than 100 nucleotides. Sequences encoding potential peptides from ORFs of this size are often ignored or filtered out by automated genome-annotation programs, because the probability of their occurring by chance alone increases exponentially as they get shorter.
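The length argument can be illustrated with a back-of-the-envelope calculation. The sketch below assumes a deliberately simple null model that is not taken from the paper: random sequence with equal base frequencies, so that 3 of the 64 possible codons are stops. Under that assumption, the chance that a stretch of n codons following a start codon contains no stop is (61/64)^n, which falls off geometrically as n grows and is therefore far from negligible for very short ORFs. The function name `prob_open_by_chance` is hypothetical.

```python
def prob_open_by_chance(n_codons, stop_fraction=3 / 64):
    """Probability that n_codons consecutive random codons contain no stop codon.

    Assumption: codons are independent and uniformly distributed, so a
    fraction 3/64 of them are stops. Real genomes deviate from this model.
    """
    if n_codons < 0:
        raise ValueError("n_codons must be non-negative")
    return (1.0 - stop_fraction) ** n_codons


# Compare a roughly 100-nucleotide ORF (about 33 codons) with longer ones.
for n in (10, 33, 100, 300):
    print(n, prob_open_by_chance(n))
```

Under this model, an annotation pipeline that accepted every open frame of about 30 codons would be swamped by chance occurrences, which is why such pipelines typically impose a minimum length.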
Short yet functional peptide-encoding ORFs are also beginning to be discovered upstream of larger conventional protein-coding ORFs9, and many of these defy convention by having unusual start codons (sequences that initiate protein synthesis)10. The experimental discovery of miPEPs and other small peptides such as these raises an inconvenient question: are we missing a vast library of biologically important peptide signals because our bioinformatic analyses are not yet well enough designed to detect them?
1. Ameres, S. L. & Zamore, P. D. Nature Rev. Mol. Cell Biol. 14, 475–488 (2013).
2. Lauressergues, D. et al. Nature 520, 90–93 (2015).
3. Birney, E. et al. Nature 489, 57–74 (2012).
4. Kellis, M. et al. Proc. Natl Acad. Sci. USA 111, 6131–6138 (2014).
5. Carthew, R. W. & Sontheimer, E. J. Cell 136, 642–655 (2009).
6. Fang, Y. & Spector, D. L. Curr. Biol. 17, 818–823 (2007).
7. Zhang, L. et al. PLoS Genet. 5, e1000716 (2009).
8. Cuperus, J. T., Fahlgren, N. & Carrington, J. C. Plant Cell 23, 431–442 (2011).
9. Andrews, S. J. & Rothnagel, J. A. Nature Rev. Genet. 15, 193–204 (2014).
10. Laing, W. A. et al. Plant Cell http://dx.doi.org/10.1105/tpc.114.133777 (2015).