RNA-silencing pathways

Besides well-known ribosomal, messenger, and transfer RNAs, many short and long RNA types are known from the cell cytoplasm. Among short noncoding RNAs (sncRNAs), small interfering RNAs (siRNAs) and microRNAs (miRNAs) play a pivotal role in the regulation of eukaryotic cytoplasmic translation, and involve a DICER-related protein and an Argonaute-related protein (Shabalina and Koonin 2008; Ghildiyal and Zamore 2009; Auyeung et al. 2013; Fang and Bartel 2015; Michlewski and Cáceres 2019). DICER proteins are required to process the immature RNA transcript to its functional form (Bernstein et al. 2001; Bartel 2018), while Argonaute proteins load the mature sncRNA and take part in the repression of the target transcripts (Bartel 2009; O’Brien et al. 2018).

Primary siRNAs are generally produced from exogenous double-stranded RNAs; conversely, primary miRNAs are transcribed from specific genomic loci (for instance, Ghildiyal et al. 2008; O’Brien et al. 2018; and references therein). However, this distinction is blurred, since siRNAs have been documented to arise from selfish elements integrated into the genome (Yang and Kazazian 2006; Chen et al. 2012), hairpins, or endogenous double-stranded RNAs (Czech et al. 2008; Kawamura et al. 2008; Okamura et al. 2008; Tam et al. 2008; Watanabe et al. 2008; Ghildiyal and Zamore 2009). Moreover, siRNAs require complete base pairing with the target mRNA, whereas miRNAs may show more flexible complementarity to their targets. This is the case in metazoans, where a short sequence at the 5’ end of the mature miRNA, called the “seed”, is crucial in the interaction with mRNAs (Shabalina and Koonin 2008; Ghildiyal and Zamore 2009; Bofill-De Ros et al. 2020).

The ancestral forms of RNAi most likely worked as defense mechanisms against viruses and transposons (Li and Ding 2005; Matzke and Birchler 2005). However, alternative hypotheses have been put forward. RNA-mediated gene silencing and suppression of exogenous or selfish elements may have been an exaptation after the evolution of an RNA machinery used for centromere assembly and proper formation of telomeres during eukaryogenesis (Cavalier-Smith 2010). Alternatively, a qualitative system drift has been proposed for RNAi, starting from the prokaryotic antisense RNA gene regulation mechanism (Torri et al. 2022).

It is commonly accepted that the last eukaryotic common ancestor possessed a proto-RNAi mechanism (Cerutti and Casas-Mollano 2006; Shabalina and Koonin 2008; Moran et al. 2017; Bråte et al. 2018; Velandia-Huerto et al. 2022); moreover, it is increasingly clear that miRNAs arose multiple times among eukaryotes, exploiting the same ancient RNAi components (Moran et al. 2017; Yazbeck et al. 2017; Bråte et al. 2018; Velandia-Huerto et al. 2022; but see Poole et al. 2014). Conversely, miRNAs and their hairpin precursors have been shown to be highly conserved within eukaryotic supergroups (Hertel and Stadler 2015; Yazbeck et al. 2017; Velandia-Huerto et al. 2022).

In metazoans, hundreds of conserved miRNA families have been identified (for instance, Yazbeck et al. 2017; Velandia-Huerto et al. 2022). If confirmed by the growing knowledge about miRNAs in non-model species, this would mean that the expansion of miRNA families in the kingdom is coincidental with, if not associated with, the diversification of body plans and ultimately the evolution of bilaterians (Hertel and Stadler 2015; Dexheimer and Cochella 2020; Desvignes et al. 2021; Ma et al. 2021). However, multicellular organisms are particularly prone to the evolution of complex regulatory networks by neutral processes, and the evolution of miRNAs in animals may not be adaptive at its roots (Lynch 2007).

To date, there is virtually no eukaryotic cell phenomenon that has not been shown to be regulated by miRNAs, from stress response (Larriba and del Mazo 2016; Riggs et al. 2018) to biomineralization (van Wijnen et al. 2013; Jiao et al. 2014), from immunity (Chen et al. 2013; Wang et al. 2018) to development and aging (Yekta et al. 2008; Kim and Lee 2019).

Retrograde signaling through RNA regulation: smithRNAs

The mitochondrion-to-nucleus communication is typically referred to as “retrograde signaling” or “mitochondrial retrograde response” (MRR; Ovciarikova et al. 2022), because it was always clear that the nucleus ought to regulate mitochondria in the eukaryotic cell, but the reverse regulatory function was not immediately understood. MRR may be mediated by cholesterol, reactive oxygen species and Ca²⁺ at nucleus-mitochondrion contact sites (Connelly et al. 2021). However, there are short RNAs (Maniataki and Mourelatos 2005; Weber-Lofti and Dietrich 2018), long noncoding RNAs (Vendramin et al. 2017; Weber-Lofti and Dietrich 2018), and peptides (Lee et al. 2013; Cohen 2014) of mitochondrial origin that have been proposed to interact with the nucleus.

Recently, it has been shown that sncRNAs with some similarities with miRNAs are involved in MRR as well; they were termed small mitochondrial highly expressed RNAs (smithRNAs) and were originally found in the Manila clam Ruditapes philippinarum (Pozzi et al. 2017). Small RNAs were already known from animal mitochondria (e.g., Mercer et al. 2011; Ro et al. 2013; Bottje et al. 2017; Riggs et al. 2018), but they had always been associated with mitochondrial targets (Mercer et al. 2011; Ro et al. 2013; Bottje et al. 2017). Conversely, smithRNAs are transcribed from the mitochondrial genome, but they regulate nuclear targets by definition. The complementarity of a small region of the sncRNA with the 3’ UTR of target messengers was shown to be a good predictor of regulated target genes (Pozzi et al. 2017; Passamonti et al. 2020).

The original in silico prediction of smithRNAs was subsequently confirmed by in vivo experiments, which also showed that smithRNAs can affect the epigenetic status of the nuclear genome by regulating histone methylation/acetylation (Passamonti et al. 2020). Finally, far from being a bivalve oddity, smithRNAs were suggested to be present in distantly related bilaterians (Passamonti et al. 2020). Notably, putative mitochondrial noncoding RNAs have also been found in Arabidopsis thaliana (Marker et al. 2002), as well as in other plants (Weber-Lofti and Dietrich 2018).

Like most sncRNAs, smithRNAs may well be genetic elements that commonly arise de novo during evolution (Velandia-Huerto et al. 2022; and references therein). Duplication, reshuffling, transposition, retrotransposition, and chimeric phenomena account for most new genes (Andersson et al. 2015; Schlotterer 2015; VanKuren and Long 2018; Zhao et al. 2021), but small noncoding loci like miRNAs may represent the most common source of de novo genes (Lu et al. 2008b; Lyu et al. 2014; Zhao et al. 2021). Most miRNAs arising de novo are probably functionless (Lu et al. 2008b; Berezikov et al. 2010) or even dead-on-arrival (Petrov et al. 1996; Petrov and Hartl 1998), but many may become adaptive miRNAs (Lu et al. 2008a; Mohammed et al. 2014, 2018; Lyu et al. 2014; Zhao et al. 2021).

Therefore, it can be stated that (i) at least some smithRNAs are miRNA-like molecules, structurally simple and requiring flexible base pairing to nuclear targets; (ii) at least some smithRNAs exert significant and broad-scope effects on the associated nuclear genome; (iii) smithRNAs may be widespread among animals and may have been present in the metazoan common ancestor; (iv) miRNA-like elements can easily evolve de novo, be conserved as adaptive traits, or be swept away by natural selection. Hence, a fundamental evolutionary question arises: how common is the emergence of new smithRNAs and of novel smithRNA functions?

Target availability

As stated, at least some smithRNAs behave as animal miRNAs and require only partial pairing with 3’ UTRs of target nuclear messengers. Namely, the extended seed region required to base-pair with and regulate the target encompasses nucleotides 1–8 of the mature miRNA molecule (Bartel 2009; McGeary et al. 2019). Although cases of alternative and noncanonical pairing sites are known (see Tan et al. 2014; Bartel 2018; McGeary et al. 2019; Bofill-De Ros et al. 2020; Rissland 2020; Komatsu et al. 2023; and references therein), in all cases only a handful of nucleotides are involved in target regulation.

To provide a rough estimate of the probability of a random sequence behaving as a miRNA-like regulatory element for a transcript within the same organism, we generated 189,339,429 random pri-miRNA-like sequences using custom-tailored Python scripts. The pri-miRNA is the canonical primary transcript of a miRNA element: it is cleaved by the protein DROSHA within the nucleus at specific sites associated with its secondary structure, producing the pre-miRNA. As described above, the pre-miRNA is then cleaved by DICER in the cytoplasm to produce the functional molecule (Ghildiyal and Zamore 2009; García-López et al. 2013; Ha and Kim 2014; Bartel 2018; and references therein). Sequences were randomly generated following the canonical pri-miRNA structure detailed in Bartel (2018): all sequences were then matured in silico, respecting the sites of DROSHA and DICER cleavage (see Ha and Kim 2014; Bartel 2018).
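The generation and in silico maturation steps can be illustrated with a minimal Python sketch. It is not the script used for the analysis: the segment lengths (FLANK, BASAL_STEM, UPPER_STEM, LOOP, OVERHANG), the perfectly paired stem, and all function names are simplifying assumptions chosen only to show the logic of cutting a random hairpin at fixed positions.

```python
import random

BASES = "ACGU"
PAIR = {"A": "U", "U": "A", "C": "G", "G": "C"}

# Illustrative lengths (assumptions, not the published settings).
FLANK = 10       # single-stranded flank on each side of the hairpin
BASAL_STEM = 11  # stem segment below the DROSHA cut
UPPER_STEM = 22  # stem segment between the DROSHA cut and the loop
LOOP = 10        # apical loop
OVERHANG = 2     # 3' overhang left by each RNase III-type cut


def reverse_complement(seq):
    """Return the Watson-Crick reverse complement of an RNA string."""
    return "".join(PAIR[b] for b in reversed(seq))


def random_rna(length, rng):
    """Draw a uniformly random RNA sequence of the given length."""
    return "".join(rng.choice(BASES) for _ in range(length))


def random_pri_mirna(rng):
    """Build flank + 5' stem arm + loop + 3' stem arm + flank.

    The 3' arm is the exact reverse complement of the 5' arm, so the
    stem is perfectly paired (a simplification).
    """
    stem_5p = random_rna(BASAL_STEM + UPPER_STEM, rng)
    stem_3p = reverse_complement(stem_5p)
    return (random_rna(FLANK, rng) + stem_5p + random_rna(LOOP, rng)
            + stem_3p + random_rna(FLANK, rng))


def mature_in_silico(pri):
    """Return the (5p, 3p) strands left after the two simulated cuts."""
    # 5p strand: starts right above the basal stem, ends at the loop.
    start_5p = FLANK + BASAL_STEM
    mature_5p = pri[start_5p:start_5p + UPPER_STEM]
    # 3p strand: shifted by OVERHANG nucleotides along the 3' arm, so
    # that each strand carries a 2-nt 3' overhang in the duplex.
    arm_3p_start = FLANK + BASAL_STEM + UPPER_STEM + LOOP
    start_3p = arm_3p_start + OVERHANG
    mature_3p = pri[start_3p:start_3p + UPPER_STEM]
    return mature_5p, mature_3p


rng = random.Random(42)  # fixed seed for reproducibility
pri = random_pri_mirna(rng)
mature_5p, mature_3p = mature_in_silico(pri)
print(pri, mature_5p, mature_3p, sep="\n")
```

With these settings each mature strand is 22 nucleotides long; real pri-miRNAs contain bulges and mismatches in the stem, which this sketch deliberately ignores.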

Since functional smithRNAs have been demonstrated in vivo in the Manila clam only (Passamonti et al. 2020), we assembled transcriptomes from 12 bivalve species for which transcriptome data are available on GenBank: Ruditapes decussatus (SRR527757); Arctica islandica (SRR1559269); Galeomma turtoni (SRR1560274); Sphaerium nucleus (SRR1561723); Laternula elliptica (SRR1687084); Lyonsia floridana (SRR1560310); Margaritifera margaritifera (SRR1560312); Arca noae (SRR1559268); Mytilus edulis (SRR1560431); Placopecten magellanicus (SRR1560445); Solemya velum (SRR330465); Yoldia eightsii (SRR3205073).

Transcriptomes were curated using the software FastQC (Andrews 2010), Trimmomatic (Bolger et al. 2014), BUSCO (Simão et al. 2015), and Trinity (Grabherr et al. 2011; Haas et al. 2013). The software Kraken2 (Wood et al. 2019) was used to classify potential contaminants of human and prokaryotic origin, using a custom-assembled database of prokaryotic sequences updated to June 2019. Peptide detection on noisy matured sequences was carried out with FrameDP (Gouzy et al. 2009), and 3’ UTRs were predicted using ExUTR (Huang and Teeling 2017) and the invertebrate dataset of 3’ UTRs.

In silico-matured RNAs were mapped onto assembled transcriptomes using Bowtie (Langmead et al. 2009), using the minus strand of the Bowtie index and requiring at least a perfect match between the 3’ UTR and nucleotides 2–8 of the simulated miRNA-like element, thus conservatively restricting the analysis to “canonical” targeting only. Scripts, commands, and settings are available from YLC and AF upon request.
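The matching criterion can be sketched in plain Python as follows. This is not a reimplementation of the Bowtie run: it is a naive scan over every window of a 3’ UTR, assuming Watson-Crick pairing only (no G:U wobble) and treating the function names and the example sequence as hypothetical. A window is accepted when its reverse complement matches nucleotides 2–8 of the miRNA-like element perfectly and differs at no more than five positions elsewhere.

```python
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def to_rna(seq):
    """Upper-case a sequence and convert DNA T to RNA U."""
    return seq.upper().replace("T", "U")


def reverse_complement(seq):
    """Reverse complement in the RNA alphabet."""
    return to_rna(seq).translate(COMPLEMENT)[::-1]


def find_sites(mirna, utr, max_mismatches=5, seed=(1, 8)):
    """Scan a 3' UTR for candidate target sites of a mature miRNA.

    seed is a 0-based half-open interval; (1, 8) is nucleotides 2-8.
    Returns a list of (window_start, mismatches_outside_seed).
    """
    mirna, utr = to_rna(mirna), to_rna(utr)
    k = len(mirna)
    lo, hi = seed
    hits = []
    for start in range(len(utr) - k + 1):
        # Reverse-complement the window so that it reads 5'->3' in
        # register with the miRNA: position i pairs with position i.
        site = reverse_complement(utr[start:start + k])
        if site[lo:hi] != mirna[lo:hi]:
            continue  # the seed must pair perfectly
        mismatches = sum(
            1 for i, (a, b) in enumerate(zip(mirna, site))
            if a != b and not lo <= i < hi
        )
        if mismatches <= max_mismatches:
            hits.append((start, mismatches))
    return hits


# Arbitrary 22-nt example sequence and a toy UTR containing its
# perfectly complementary site after four flanking nucleotides.
mirna = "UGAGGUAGUAGGUUGUAUAGUU"
utr = "AAAA" + reverse_complement(mirna) + "CCCC"
print(find_sites(mirna, utr))
```

A dedicated short-read aligner is far faster on hundreds of millions of queries; the scan above only makes the acceptance rule explicit.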

The number of simulated miRNA-like elements able to find targets in the transcriptome was normalized over the number of k-mers (k = 22 nucleotides) available in the 3’ UTRs of the focal transcriptome: the result was divided by 189,339,429 (the number of random pri-miRNAs) to get an estimate of the probability for a single miRNA-like element to find a suitable target in a given k-mer.
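As a minimal sketch of this normalization, the following Python snippet computes the per-k-mer probability from a count of elements with at least one target. The function names are hypothetical, and counting every overlapping 22-nt window (rather than distinct k-mers) is an assumption about how "number of k-mers" was defined.

```python
N_SIMULATED = 189_339_429  # number of random pri-miRNA-like sequences


def count_kmers(utrs, k=22):
    """Total number of overlapping k-nt windows across all 3' UTRs."""
    return sum(max(0, len(utr) - k + 1) for utr in utrs)


def per_kmer_probability(n_elements_with_target, utrs,
                         n_simulated=N_SIMULATED, k=22):
    """Elements with a target, normalized by k-mers and by simulations."""
    n_kmers = count_kmers(utrs, k)
    if n_kmers == 0:
        raise ValueError("no 3' UTR is long enough to contain a k-mer")
    return n_elements_with_target / n_kmers / n_simulated


# Toy input: three UTRs of 22, 30 and 10 nt give 1 + 9 + 0 windows.
toy_utrs = ["A" * 22, "A" * 30, "A" * 10]
print(count_kmers(toy_utrs))
```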

The probability for a random pri-miRNA-like sequence to result in a mature miRNA having a target on a transcriptome is exponentially linked to the number of mismatches outside the seed region, irrespective of the species the transcriptome is obtained from (Fig. 1). Specifically, this probability is approximately one in a hundred million (1 × 10⁻⁸) if exactly five mismatches between the mature miRNA-like molecule and a 3’ UTR are considered (provided that the seed basepairs perfectly).

Fig. 1: Frequency of miRNA-like simulated molecules that found at least one suitable target on 3’ UTRs of a given species.

The seed was conservatively defined as nucleotides 2–8 of the miRNA; a match was accepted if it was perfect at the seed and if it included a maximum of 5 mismatches outside. An example of an alignment with three mismatches is included in the inset. The number of elements with an acceptable match was normalized on the number of 22-mers in the relative 3’ UTR set and divided by the number of simulated pri-miRNAs. The y-axis is log-transformed for the sake of readability. Regression line details: y = 1.0757x − 12.8616; R² = 0.9719; ***P < 2 × 10⁻¹⁶.

Given the large number of replicating mitochondrial genomes in the germline, and the huge number of individuals and populations of these species, one in a hundred million should be regarded as a high chance for a de novo-arisen mitochondrial miRNA-like element to find a regulatory target in the nuclear transcriptome of the same cell. Notably, this probability does not change across species, which means that it is independent of nuclear transcriptome features.
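The intuition that a tiny per-trial probability becomes appreciable over many trials can be made explicit with a short calculation. The sketch below assumes independent trials, each succeeding with the same probability p; the number of trials used in the example is purely hypothetical and is not an estimate taken from the data.

```python
import math


def prob_at_least_one(p, n_trials):
    """P(at least one success) = 1 - (1 - p)**n for independent trials.

    Computed through log1p/expm1 to stay accurate when p is tiny.
    """
    return -math.expm1(n_trials * math.log1p(-p))


p = 1e-8          # per-trial probability, order of magnitude from the text
n_trials = 10**8  # hypothetical number of independent opportunities
print(prob_at_least_one(p, n_trials))
```

Under these assumptions, once the number of opportunities is comparable to 1/p, the chance of at least one success approaches 1 − 1/e, that is, roughly 63%.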

It is worth noting that we conservatively focused on the 2–8 heptamer seed pairing, but other types of seed pairing are conceivable, and, thus, this probability is largely underestimated. Moreover, more than five mismatches are normally allowed in miRNA-driven regulation in animals (Shabalina and Koonin 2008; Ghildiyal and Zamore 2009; Bofill-De Ros et al. 2020), thus again increasing the chances for a de novo mitochondrial miRNA-like element, since the decimal logarithm of probability is positively correlated with mismatches outside the seed (r = +0.9858; Fig. 1).
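The log-linear relationship reported in the legend of Fig. 1 can be evaluated directly. The sketch below uses only the published slope, intercept, r and R² values; the function names are hypothetical, and any value computed beyond five mismatches is an extrapolation outside the simulated range and should be read with caution.

```python
SLOPE = 1.0757        # from the regression line in Fig. 1
INTERCEPT = -12.8616  # from the regression line in Fig. 1
R = 0.9858            # correlation coefficient quoted in the text
R_SQUARED = 0.9719    # coefficient of determination in Fig. 1


def log10_probability(mismatches):
    """Fitted log10(probability) for a number of non-seed mismatches."""
    return SLOPE * mismatches + INTERCEPT


def probability(mismatches):
    """Back-transform the fitted value to a probability."""
    return 10 ** log10_probability(mismatches)


# The two reported statistics agree up to rounding.
assert abs(R ** 2 - R_SQUARED) < 1e-3

for n in range(6):
    print(n, f"{probability(n):.2e}")
```

At five mismatches the fitted value is on the order of 10⁻⁸, in line with the estimate given in the text, and each additional tolerated mismatch raises the fitted probability by roughly one order of magnitude.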

If this trend is confirmed outside bivalves, it will be tempting to conclude that the DNA chemistry and nucleotide composition of eukaryotes, as well as constraints on pri-miRNA structures, do result in a significant probability that a miRNA-like element finds a suitable nuclear target, after having originated merely by chance and random mutations on a mitochondrial genome.

Mitochondrial secondary structures are easily co-opted to deliver new functions

Obviously, the probability that a simulated sequence matches a 3’ UTR is not enough to state that smithRNAs commonly arise de novo. A smithRNA is a sncRNA associated with a specific biogenesis pathway, which requires molecular signals for processing enzymes, such as secondary structures.

In the traditional view, the animal mitochondrial genome is believed to be small and compact, containing a conserved set of protein-coding genes associated with the mitochondrial oxidative phosphorylation (OXPHOS) pathway (Boore 1999). However, recent research has shown that this may not always be the case, challenging the notion of ubiquitous features in metazoan mitochondrial genomics (Lavrov et al. 2013; Breton et al. 2014; Formaggioni et al. 2021). Actually, animal mitochondrial genomes are highly variable with respect to genome architecture (Lavrov and Pett 2016); genome size (Pu et al. 2019; Hemmi et al. 2020); use of different genetic codes (Lavrov et al. 2013; Li et al. 2018); gene arrangement (Trindade Rosa et al. 2017; Pu et al. 2019; Hemmi et al. 2020; Monnens et al. 2020; Ghiselli et al. 2021; Kutyumov et al. 2021); doubly uniparental inheritance (DUI; Passamonti and Ghiselli 2009; Zouros and Rodakis 2019; Passamonti and Plazzi 2020); and post-transcriptional regulation (Osigus et al. 2017; Schuster et al. 2017).

The fine-tuning of some of these mechanisms (for instance, DUI, post-transcriptional regulation) and the origin of these features involve complex crosstalk with nuclear genomes, as well as the availability of regulatory sequences and signals along the mitochondrial genome (e.g., Ghiselli et al. 2013, 2021). For example, since mitochondrial DNA is normally transcribed as a single polycistron (e.g., Hillen et al. 2018), structural signals ought to be present to cleave it into single transcripts; these signals are normally found between protein-coding genes as tRNA genes or short noncoding regions with stem-and-loop secondary structures (e.g., Plazzi et al. 2013; Bettinazzi et al. 2016).

Therefore, mitochondrial genomics itself requires multiple secondary structures to regulate the organellar functions. Moreover, many of these structural sites are processing and cleavage signals, as is the case for protein-coding gene spacers, which are excised to separate single transcripts. These RNA hairpins are normally processed and degraded as part of the normal cellular turnover of macromolecules.

However, it is easy to speculate that a hairpin might survive by being directly co-opted as a pre-miRNA. It is sufficient that its secondary structure can be recognized by some DICER ortholog: hairpin structures that are normally found in cleavage signals are indeed very similar to the hairpin structures normally shown by pre-miRNAs. In that case, the RNA would be cleaved and a miRNA would be produced, skipping the pri-miRNA/DROSHA stage, and it would find a suitable nuclear target one in a hundred million times, and probably more (as per our simulation above). Other examples of DROSHA-independent biogenesis of miRNAs are indeed known (Ruby et al. 2007; Babiarz et al. 2008; O’Brien et al. 2018).

Obviously, a hairpin excised within the mitochondrion must be delivered to the cytoplasm before the final (and, in this case, only) maturation step, which is driven by DICER. In fact, many studies found mitochondrial RNA outside the source organelle, which supports the possibility that RNA molecules are exported. For example, several tRNAs of mitochondrial origin were found in the cytoplasm of human cells, even in association with Ago2, an Argonaute protein involved in the formation of the functional complex responsible for RNA silencing (Maniataki and Mourelatos 2005). Mitochondrially encoded RNAs can bind Ago2 as well (Pozzi and Dowling 2022), and long noncoding RNAs from the mitochondrion were also reported within the nucleus (Landerer et al. 2011; Rackham et al. 2011; Vendramin et al. 2017). Interestingly, mitochondria of R. philippinarum have been observed releasing their content into the cytoplasm (Milani et al. 2011), which would be a straightforward mechanism for smithRNAs to enter the cytoplasm, at least in this species.

RNAi driven by mitochondria might be a remnant of their origin as free-living, aerobic prokaryotes. Notably, the intracellular pathogen Mycobacterium marinum synthesizes small, antisense regulatory RNAs, which are exported to the host cell and processed as if they were miRNAs (Furuse et al. 2014) and, generally speaking, many bacterial small RNAs show complex secondary structures (Wagner and Simons 1994). Indeed, a connection between small antisense regulatory RNAs in prokaryotes and the cytoplasmic proto-RNAi system in early eukaryotes has been suggested (Torri et al. 2022). In sum, we propose that smithRNAs arise as an exaptation at the molecular level of secondary structures that were always present in mitochondrial genomes, possibly since their origin as endosymbionts. Moreover, we also predict that this phenomenon might be more common than thought, given the similar selective constraints on hairpins.

Retrograde RNAi and mito-nuclear coadaptation

Mitochondrial and nuclear genomes must coevolve to provide efficient energy production (Hill 2019). The electron transport system of mitochondria (ETS), to which the efficiency of energy production through OXPHOS is strictly linked, is delivered by a complex assembly of nuclear and mitochondrial subunits that are forced to function together (Rand et al. 2004). An effective OXPHOS is achieved by three different mechanisms: (i) protein–protein interaction forming the ETS complexes (Phillips et al. 2010); (ii) protein–RNA/DNA interactions during transcription and translation of mitochondrial genes (Taanmann 1999; D’Souza and Minczuck 2018); and (iii) protein–DNA interaction in the replication of the mitochondrial genome (Clayton 2000).

In fact, speciation soon started to be discussed in the context of mito-nuclear coadaptation, as a mechanism that may easily evolve mito-nuclear incompatibilities (Dowling et al. 2008; Gershoni et al. 2009; Burton and Barreto 2012). Examples of these mito-nuclear incompatibilities are, for instance, available for Drosophila and Tigriopus copepods (see Hill 2019; and references therein).

Although the abovementioned system may suggest a strict need for mito-nuclear coadaptation, other systems point in the opposite direction. In bivalves with DUI, two mitochondrial genomes are transmitted to offspring in a sex-linked way (Passamonti and Ghiselli 2009; Zouros and Rodakis 2019; Passamonti and Plazzi 2020), and there is evidence of a functional assembly of the ETS with two highly divergent sets of mitochondrial proteins. Therefore, the correct protein–protein interaction forming the ETS complexes is less strict than previously thought, at least in these bivalve mollusks.

The existence of mitochondrially mediated RNAi provides a fourth mechanism for the evolution of mito-nuclear incompatibilities, which can arise much faster than the other three. When a set of smithRNAs is adapted to regulate nuclear gene expression in a species, the system could easily produce genetic barriers with other species having a differently adapted smithRNA subset. To our knowledge, there is currently no study on this issue, but we strongly suggest that the cases of mito-nuclear incompatibilities may be reconsidered in light of the role of the mitochondrial genome in regulating nuclear gene expression. In this conception, smithRNAs (and maybe other MRR mechanisms) may represent classical Dobzhansky–Muller speciation triggers (Dobzhansky 1937; Muller 1942), which lead to the evolution of postzygotic genetic barriers.

Concluding remarks

Notwithstanding their recent discovery (Pozzi et al. 2017), it is likely that smithRNAs are not a peculiar feature of a single bivalve species: they are probably widespread among metazoans (Passamonti et al. 2020). This does not necessarily imply that they are phylogenetically related, nor that the origin of smithRNAs is a single event in evolutionary history. The peculiar features of mitochondrial genomes entail the possibility that smithRNAs spontaneously arose multiple times from the secondary structure repertoire that is normally available along the mitochondrial genome.

Therefore, it is important to characterize the smithRNA toolbox in as many animal species as possible, and functional studies are required to prove that smithRNAs are regulatory elements in vivo. This will increase the list of functions smithRNAs can exert in the cell; moreover, light will be shed on the evolutionary conservation of smithRNAs and on their multiple origins through molecular exaptation, two scenarios that are not mutually exclusive. Finally, if smithRNA precursors (or at least some of them) arise as an exaptation of ancient legacies from free-living bacteria, smithRNAs might be strictly connected with early eukaryogenesis.