Long non-coding RNAs (lncRNAs) have emerged in recent years as major players in a multitude of pathways across species, but it remains challenging to understand which of them are important and how their functions are performed. Comparative sequence analysis has been instrumental for studying proteins and small RNAs, but the rapid evolution of lncRNAs poses new challenges that demand new approaches. Here, I review the lessons learned so far from genome-wide mapping and comparisons of lncRNAs across different species. I also discuss how comparative analyses can help us to understand lncRNA function and provide practical considerations for examining functional conservation of lncRNA genes.