Xrn1-resistant RNA motifs are disseminated throughout the RNA virome and are able to block scanning ribosomes

RNAs that are able to prevent degradation by the 5’–3’ exoribonuclease Xrn1 have emerged as crucial structures during infection by an increasing number of RNA viruses. Several plant viruses employ the so-called coremin motif, an Xrn1-resistant RNA that is usually located in 3’ untranslated regions. Investigation of its structural and sequence requirements has led to its identification in plant virus families beyond those in which the coremin motif was initially discovered. In this study, we identified coremin-like motifs that deviate from the original in the number of nucleotides present in the loop region of the 5’ proximal hairpin. They are present in a number of viral families in which no Xrn1-resistant RNA had previously been identified, including the double-stranded RNA virus families Hypoviridae and Chrysoviridae. Through systematic mutational analysis, we demonstrated that a coremin motif carrying a 6-nucleotide loop in the 5’ proximal hairpin generally requires a YGNNAD consensus for stalling Xrn1, similar to the previously determined YGAD consensus required for Xrn1 resistance of the original coremin motif. Furthermore, we determined the minimal requirements for the 3’ proximal hairpin. Since some putative coremin motifs were found in intergenic regions or coding sequences, we demonstrated their capacity for inhibiting translation through an in vitro ribosomal scanning inhibition assay. Consequently, this study further expands the number of viral families with known Xrn1-resistant elements, while adding a novel, potentially regulatory function for this structure.


Design and production of DNA templates and in vitro RNA transcription
Templates for production of RNA to be tested for Xrn1 resistance were designed and produced as described in Dilweg et al. 10, using forward and reverse oligos that were purchased from Sigma-Aldrich in desalted form, with reverse complementary 3' ends that allow compatibility in PCR reactions (see Supplemental data). This method yields products with a T7 promoter sequence (GTA ATA CGA CTC ACT ATA), followed by a 12 nt leader sequence and the construct of interest, as depicted in Fig. 2. DNA templates were validated by agarose gel electrophoresis and subsequently purified by ethanol/NaAc precipitation. Templates for CGA AAU, CGA AAC, CGA AAG, ACGAA and CGA AAA A lp1 constructs (Fig. 3) were instead acquired from the pMRL-derived plasmids that were produced for in vitro ribosomal scanning inhibition assays (Fig. 5). These carried an NcoI restriction site just downstream of the construct inserts. Thus, after linearization with NcoI, the product carried a T7 promoter sequence, followed by a 21 nt leader sequence (GGC TAG TTA AGA TAT AAC ATT) and the construct of interest. About 100 ng of linearized plasmid was used for run-off transcription. In vitro transcription was carried out for 30 min at 37 °C using the T7 RiboMAX™ Large Scale RNA Production System (Promega) for both template types. Reaction mixtures were treated with 1 unit RQ1 RNase-free DNase for 20 min at 37 °C. Transcript concentration was checked on agarose gel.

In vitro Xrn1 degradation assay
Per reaction, about 200 ng of transcript was treated either with or without RppH and Xrn1 (both New England Biolabs), as described earlier 10. After the addition of an equal volume of denaturing loading buffer (8 M urea, 20 mM Tris-HCl, 20 mM EDTA, trace amounts of bromophenol blue and xylene cyanol FF), RNA was denatured for 5 min at 75 °C. These samples were run on 8 M urea 14% polyacrylamide gels in TBE buffer, equilibrated at 60-65 °C. In the case of MRV JP-B and PicaV-C constructs (Fig. 2B), 14% non-denaturing polyacrylamide gels in TAE buffer were used instead, as described in Dilweg et al. 10. Gels were stained with EtBr and most constructs were subjected to this assay at least twice. Band intensities were quantified using ImageJ software.

Design of pScan and production of Renilla luciferase mRNA
The Renilla luciferase reporter vector pMRL 27 was digested with HindIII and MfeI in order to insert pairs of complementary oligonucleotides that introduced KspAI and Van91I restriction sites between the T7 promoter sequence and the start codon of the Renilla luciferase ORF. Digestion of the resulting plasmid with KspAI and Van91I allowed insertion of pairs of complementary oligonucleotides that housed the xrRNA BNYVV-derived constructs of interest. Subsequent linearization at the XhoI site downstream of the Renilla luciferase ORF, and purification with the PureYield™ Plasmid Miniprep System (Promega), yielded products suitable for run-off transcription as described above, though an incubation time of at least 1 h was applied in order to allow for production of the larger transcript. Reaction mixtures were treated with 1 unit RQ1 RNase-free DNase for 20 min at 37 °C, after which 20 μL Milli-Q water was added. Free nucleotides were removed by filtration through illustra™ MicroSpin™ G-25 columns (GE Healthcare), and the RNA concentration of the flowthrough was determined by measuring absorbance at 260 nm and checked by agarose gel electrophoresis. Transcripts were diluted to 25 ng/μL with Milli-Q water.
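The concentration and dilution steps above can be illustrated with a short Python sketch. It is not taken from the original protocol: the conversion factor of 40 ng/μL per A260 unit for single-stranded RNA is the commonly used approximation and is an assumption here, and the function names are hypothetical.

```python
# Assumption: 1 A260 unit corresponds to roughly 40 ng/uL of single-stranded
# RNA at a 1 cm path length. The paper does not state the factor it used.
RNA_NG_PER_UL_PER_A260 = 40.0


def rna_concentration(a260, dilution_factor=1.0):
    """Estimate RNA concentration (ng/uL) from an absorbance reading at 260 nm."""
    return a260 * RNA_NG_PER_UL_PER_A260 * dilution_factor


def water_to_add(stock_ng_per_ul, stock_volume_ul, target_ng_per_ul=25.0):
    """Volume of water (uL) needed to reach the target concentration (C1*V1 = C2*V2)."""
    if stock_ng_per_ul < target_ng_per_ul:
        raise ValueError("stock is already below the target concentration")
    final_volume_ul = stock_ng_per_ul * stock_volume_ul / target_ng_per_ul
    return final_volume_ul - stock_volume_ul
```

For example, a hypothetical stock measured at 200 ng/μL in 10 μL would need 70 μL of water to reach the 25 ng/μL working concentration.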

In vitro ribosomal scanning inhibition assay
For each measurement, a premix was prepared containing per sample 5 μL nuclease-treated rabbit reticulocyte lysate (Promega), 0.5 μL of 1 mM amino acid mixture without methionine, 0.5 μL of 1 mM amino acid mixture without lysine and 2 μL of 30-fold diluted Renilla-Glo™ Luciferase Assay Substrate (coelenterazine, Promega). Of this mixture, 8 μL was transferred to a 96-well reaction plate per construct, in triplicate. To each well, 2 μL of 25 ng/μL RNA was added and mixed well, with intervals of 10 s. Luminescence was measured continuously for at least 100 min on a GloMax® Microplate Reader, with appropriate intervals between each well. For each time-point, means were calculated and normalized against the maximum mean luminescence reached by the sp.mut construct (Fig. 5) in order to obtain values for relative luminescence, and to correct for differences in absolute luminescence between experiments.
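The normalization described above can be sketched in Python as follows. This is an illustration under stated assumptions rather than the analysis code used in the study: the input layout (a dictionary mapping each construct name to its triplicate time traces) and the function names are hypothetical.

```python
from statistics import mean


def relative_luminescence(traces, reference="sp.mut"):
    """Normalize mean luminescence per time-point to the reference maximum.

    traces maps a construct name to a list of replicate traces, where each
    trace is a list of raw readings taken at the same time-points.
    """
    # Mean over replicates at each time-point.
    means = {
        name: [mean(readings) for readings in zip(*replicates)]
        for name, replicates in traces.items()
    }
    # Maximum mean luminescence reached by the reference construct.
    reference_max = max(means[reference])
    return {
        name: [value / reference_max for value in series]
        for name, series in means.items()
    }


def maximum_relative_luminescence(traces, reference="sp.mut"):
    """Maximum relative luminescence (MRL) per construct."""
    relative = relative_luminescence(traces, reference)
    return {name: max(series) for name, series in relative.items()}
```

By construction the reference construct reaches an MRL of 1, so values from different experiments become comparable even when absolute luminescence differs.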
(Fig. 1). Of note, several candidate sequences carried, besides the regular motif characteristics, an lp1 consisting of five, six, seven or eight nucleotides instead of the four found in xrRNA BNYVV.
These hits provide for several viral families a first indication for the presence of an xrRNA within their genome. Notably, the double-stranded RNA (dsRNA) virus families of Hypoviridae and Chrysoviridae carry several candidate xrRNA sequences, within the 3' UTR of Rosellinia necatrix hypovirus 2 (RnHV2), in a satellite-like RNA derived from Botrytis cinerea hypovirus 1 (BcHV1), and in the 3' UTRs of all four Chrysothrix chrysovirus 1 (CcCV1) genomic RNAs. For the hits found within Tombusviridae and Solemoviridae, this represents the first time that two different types of xrRNA are found within a viral family, as both families house xrRNA LT-type structures as well 22. Intriguingly, the Polerovirus CYDV-RPV appears to house both an xrRNA LT in its IGR and a putative xrRNA C around the stop codon of its most downstream ORF. While xrRNA C was only identified in 3' UTRs before, here we identified multiple hits in IGRs and even a few in coding sequences (CDS). Of note is Pea enation mosaic virus-1 (PEMV-1), which carries a putative xrRNA C in its IGR for the isolate ID, or in a 3' UTR in isolate LK.

Xrn1 resistance of novel, putative xrRNA C hits
To verify whether these novel xrRNA C-like motifs function as actual xrRNAs, we tested Xrn1 resistance for some of the hits within their original contexts, using the exact hp1, spacer and hp2 sequence as they occur in the genome (Fig. 2A). Although in plants Xrn4 is the major exoribonuclease, we used yeast Xrn1, as this enzyme has been found to resemble Xrn4 in many aspects 19,28. Resistant constructs correspond with downshifted bands, where the single-stranded leader that is added to the construct is degraded, and Xrn1 is stalled at the nucleotide preceding hp1. All putative xrRNAs tested here appeared to show Xrn1 resistance (Fig. 2B). The heptaloop of RnHV2 appears Xrn1-resistant, although not all RNA has been subjected to degradation. This indicates that Xrn1 has not been able to associate with the construct, perhaps due to the A/U-rich nature of its hp2 forming interactions with the leader that prevent proper folding of the construct and Xrn1 landing on a single-stranded 5' end. Replacing the unstable hp2 by that of BNYVV. However, in the context of xrRNA BNYVV the CGA GUA A loop clearly confers Xrn1 resistance. We note that MRV JP-B also contains the BNYVV hp2, since we anticipated problems with its A/U-rich hp2 as well (Fig. 1). The CYDV-RPV construct also appeared to be Xrn1-resistant despite consisting of multiple bands, a problem that was likely caused by sequence-specific 3′ terminal addition of nucleotides by T7 RNA polymerase during transcription. This was largely solved by extending the DNA template, and hence the RNA, with an additional native hairpin (CYDV-RPV + hp3).

Interrogating the lp1-variation of xrRNA C
The above results seemed to indicate that even in the context of BNYVV xrRNA a heptaloop confers functional Xrn1 resistance. To make a direct comparison of the effect of loop composition on Xrn1 resistance possible, we tested several loop variants in the BNYVV context (Fig. 3A). The constructs carrying lp1 hexaloops CGU GAA (BcHV1), CGC AAU (ETBTV, AEV-2, CjTLV & HnlV-4), CGA GAU (PicaV-C), CGG AAA (PEMV-1), CGC GAA (FgDFV1) and CGA AAU (AEV-1, MRV JP-B & RNA1 of CcCV1) were all clearly resistant to Xrn1 (Fig. 3B), as well as the heptaloop CGA GUA A (RnHV2) and the octaloop CGC AAA GC (GRV, Fig. 3C). In contrast, the pentaloop CCGAA (TRV) did not show Xrn1 resistance. The strict requirements for specific nucleotides in the xrRNA C hp1 and spacer sequence invited us to determine to what extent the hexaloop sequence could be varied without disturbing the Xrn1 resistance. Since the fourth nucleotide of the xrRNA BNYVV tetraloop allowed for either an A, G or U 10, and CGA AAU was resistant, we tested whether this applied to the hexaloop variant as well. Indeed, constructs carrying CGA AAA and CGA AAG were Xrn1-resistant, while CGA AAC was not (Fig. 3D). This suggests that the most downstream hexaloop nucleotide fulfills the same function as the most downstream tetraloop nucleotide. Furthermore, ACG AAA and AAC GAA constructs were made, to directly assess whether the additional two nucleotides of the hexaloop variant could be placed at any position without disturbing proper folding of the structure, and thus whether the CGAA motif of the tetraloop could just be shifted downstream. This does not seem to be the case, as these constructs did not retain Xrn1 resistance (Fig. 3D), indicating as well that the most upstream loop nucleotide is involved in the same interaction within either the hexaloop or the tetraloop xrRNA C variants. Interestingly, a CCG AAA loop, carrying the CGAA tetraloop motif combined with an additional C at the first hexaloop position, does not stall Xrn1 either (Fig. 3D), which either indicates that the G in the third hexaloop position cannot substitute for missing a G in the second position or that the C in the second position is disruptive for proper folding. Both arguments provide a potential explanation for why the pentaloop CCGAA did not show proper Xrn1 resistance either.
In order to further characterize the hexaloop variant and to compare it to the tetraloop variant, constructs containing CGU AAA, CGA UAA and CGA AUA were tested, since in the tetraloop xrRNA C a CGUA lp1 was not Xrn1-resistant 10. Of these constructs, only CGA AUA did not retain Xrn1 resistance, suggesting that within the hexaloops, the fifth nucleotide fulfills the role of the third tetraloop nucleotide (Fig. 3E). This hypothesis was tested further by assessing the Xrn1 resistance of constructs containing a CGA CAA or CGA ACA loop, corresponding with the non-functional CGAC tetraloop lp1. Indeed, the fact that CGA CAA retains Xrn1 resistance, while CGA ACA and CGA AAC do not (Fig. 3E), appears to confirm this hypothesis, while also indicating that hexaloop position 4 could be any nucleotide, especially considering that the tested natural hexaloops appear to follow this trend as well. While the natural hexaloop sequences discovered all contained a C as their first loop nucleotide, the tetraloop xrRNA C was previously determined to allow for a YGAD consensus. The variants UGG GAA, UGG AAA and UGC GAA were tested in order to see if this were true for the hexaloop xrRNA C constructs as well (Fig. 3F). When compared to their C-carrying counterparts, this was the case, although the amount of RNA left over did decrease significantly, comparable to what occurs for a UGAA lp1 10. The partial resistance of UGG GAA, UGG AAA, UGC GAA and CGA AAG (Fig. 3D) is possibly due to the formation of an additional base pair between positions 1 and 6 in the loop, leading to a stable GNRA tetraloop that interferes with the function of the hexaloop.
In comparing the tetraloop xrRNA C YGAD consensus with a hexaloop counterpart, a next set of constructs was aimed at figuring out whether the necessary position of the tetraloop GA is allowed to be changed within the hexaloop equivalent. As such, CAG AAA, CAA GAA and CAA AGA loops were tested, and all turned out to be unable to stall Xrn1 (Fig. 3G), indicating that the position of the loop G has to be retained. Previous work has indicated that within the tetraloop xrRNA C, a CAGA loop is not Xrn1-resistant either 10.
Overall, the hexaloop variants tested in this study point towards a conservation of the tetraloop xrRNA C YGAD consensus, with any two nucleotides in between the second and third positions (YGNNAD). Considering this knowledge, we were curious to see whether it would be possible to rescue the non-functional CCGAA by testing a sequence that would follow YGNAD. This was not the case, however, as CGAAA did not result in any Xrn1-resistant RNA (Fig. 3H). As expected from the non-functional ACG AAA hexaloop, an ACGAA was not Xrn1-resistant either. Furthermore, the Xrn1-resistant heptaloop variant of RnHV2 inspired us to test the simpler CGA AAA A loop, which did appear to be completely resistant (Fig. 3H), further indicating that the YGAD motif is necessary and could be extended through extra nucleotides in the middle, but that more than one extra nt is needed. We note that not all input RNA of ACGAA and CGA AAA A was digested by Xrn1; this is probably due to the use of a different leader sequence in these constructs (see also Materials and methods).
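As an illustration of how the YGAD and YGNNAD consensus sequences could be used to screen candidate lp1 sequences, the following Python sketch translates the IUPAC codes (Y = C/U, D = A/G/U, N = any nucleotide) into regular expressions. It is a sequence filter only and not a predictor of Xrn1 resistance: the consensus was derived in the xrRNA BNYVV context, and the function names are hypothetical.

```python
import re

# IUPAC nucleotide codes that occur in the two consensus sequences.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "U": "U",
    "Y": "[CU]",    # pyrimidine
    "D": "[AGU]",   # anything but C
    "N": "[ACGU]",  # any nucleotide
}


def consensus_to_regex(consensus):
    """Compile an IUPAC consensus string into a regular expression."""
    return re.compile("".join(IUPAC[code] for code in consensus))


TETRALOOP = consensus_to_regex("YGAD")
HEXALOOP = consensus_to_regex("YGNNAD")


def matches_lp1_consensus(loop):
    """Return True/False for tetra- and hexaloops, None for other loop sizes.

    Spaces are ignored and DNA input (T) is read as RNA (U). No consensus
    is proposed in the text for penta-, hepta- or octaloops, hence None.
    """
    loop = loop.replace(" ", "").upper().replace("T", "U")
    if len(loop) == 4:
        return bool(TETRALOOP.fullmatch(loop))
    if len(loop) == 6:
        return bool(HEXALOOP.fullmatch(loop))
    return None
```

With this filter, the resistant loops CGA AAU and CGA CAA match, whereas the non-resistant CGA AAC, CGA AUA, CAG AAA and ACG AAA do not.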

Finding the minimal hp2 for xrRNA C
Previous work on the xrRNA C motif has established that the second hairpin hp2 is absolutely required for Xrn1 resistance, although it is not conserved at sequence level 10. Finding novel putative xrRNA hits with comparable hp1 and spacer, but quite variable downstream sequences (Fig. 1), underscores the need for determining more exactly what is minimally required downstream to keep Xrn1 resistance. Overall, through systematic substitution of several elements within, or flanking, hp2, we noticed that most changes allowed for sustained Xrn1 resistance (Fig. 4). These changes included deletion of the A22 flanking hp2, or both nucleotides flanking hp2; substituting G35 for a U; and reducing hp2 down to two G-C bps, with either a thermodynamically stable GAAA, or a regular AUUU tetraloop. Although the latter construct still retained Xrn1 resistance, this variant showed many additional bands indicating either undenatured or misfolded intermediates, making the analysis less reliable. Likewise, the removal of the 3' A from the latter construct (Fig. 4, last two lanes) or replacing the two G-C bps by A-U bps still allowed for Xrn1 resistance but also led to the appearance of additional bands that were not Xrn1-resistant. From this we conclude that a 2-bp hp2 is sufficient to stall Xrn1, but it should preferably be capped by a stable tetraloop.

Stalling scanning ribosomes by xrRNA C
Finding putative xrRNAs in IGRs and CDSs, and the ability to stall and prevent the helicase activity of Xrn1, led us to investigate whether xrRNA C structures are able to stall scanning ribosomes. This was tested by cloning xrRNA BNYVV and mutated versions within the 5' UTR of a luciferase reporter plasmid (Fig. 5A). The mRNA derived from these plasmids was used in a rabbit reticulocyte lysate in vitro translation system, in which conversion of substrate by produced luciferase enzymes was tracked over time by detecting the luminescence resulting from this reaction. The degree of ribosomal stalling by xrRNA BNYVV (wildtype) was determined through comparison with a construct carrying a substitution of the spacer C and U with two A's (sp.mut), a version that has previously been shown to be unable to stall Xrn1 10. It appears that the production of luciferase occurs much more rapidly within the sp.mut construct, leading to roughly a five-fold increase in maximum relative luminescence (MRL) after about an hour of translation compared to the wildtype construct (Fig. 5; see Supplementary figure S1 for time traces). This indicates that the identity of the spacer sequence also plays a crucial role in the extent to which ribosomal scanning can be delayed.
The Xrn1-digestion assays using xrRNA C motifs with loops carrying five, six or seven nucleotides yielded the notion that pentaloops are mostly unable to stall Xrn1, whereas hexaloops could, given that they follow a YGNNAD consensus. The consensus for a heptaloop was not investigated as thoroughly, but both a CGA GUA A and a CGA AAA A loop were found to be Xrn1-resistant. The luminescence traces for the constructs carrying the pentaloops CGAAA (lp1a), UGAAG (lp1b) and ACGAA (lp1c) all show a strong loss of ribosomal stalling (Fig. 5), even stronger than that demonstrated by sp.mut. Conversely, hexaloop lp1 constructs revealed a measure of ribosomal stalling that roughly equals (lp1f, CGA AAG), or even surpasses (lp1d, CGA AAA; lp1e, CGA AAU)
Since many of these motifs were found in plant viruses, we investigated the main constructs also in a wheat germ extract. This led to basically the same results (Supplementary Fig. 2), demonstrating that these Xrn1-resistant motifs can impede ribosomal scanning irrespective of ribosome origin.

Discussion
Because of the relatively small size of the Xrn1-resistant coremin motif and the lack of structural information currently available, exactly how xrRNA C prevents Xrn1 from progressing remains elusive. The types of xrRNA found in Flaviviridae and Tombusviridae have been characterized through mechanistic studies and crystal structures, indicating elaborate assemblies of stem-loops, pseudoknots, and additional tertiary interactions [17][18][19]28,29 forming a ring-like structure that, when approached from the 5' side, serves as a mechanical and topological blockade that Xrn1 cannot progress through. The initially predicted secondary structure configuration for xrRNA BNYVV, with two small hairpins separated by a spacer, does not readily suggest a similar mode of stalling. In the absence of crystallographic data, we have further characterized this motif through Xrn1 degradation assays on a large variety of constructs based on the xrRNA BNYVV. These have expanded the currently known distribution of putative xrRNA Cs, and further clarified what is minimally required at the sequence and nucleotide level for Xrn1 to be stalled.

Loop size matters
Through GenBank BLAST searches, several xrRNA C-like viral sequences were found that carried an lp1 of five or more nucleotides, instead of the xrRNA BNYVV tetraloop. Most of the constructs containing such loops were Xrn1-resistant, either within the xrRNA BNYVV context (Fig. 3), or in their own (Fig. 2). Systematic variation of hexaloop-containing xrRNA C motifs led us to propose a consensus sequence of YGNNAD for the lp1. Notable exceptions to this consensus discovered through our BLAST searches, however, are two of the four putative xrRNAs identified within the Alphachrysovirus CcCV1. Both carry a C at the last position, which highlights the importance of testing these sequences within their own genomic context; the consensus therefore cannot be conclusively used for determining the Xrn1 resistance of a similar motif. The lp1 consensus suggests that the middle two nucleotides bulge out, while the flanking nucleotides may be involved in an interaction with the spacer as proposed earlier for the YGAD consensus within xrRNA BNYVV 10. This raises the issue whether the two seemingly uninvolved nucleotides have a non-structural function, or whether they are actually redundant. The fact that even the predicted GRV octaloop yields an Xrn1-resistant structure within the xrRNA BNYVV context suggests that more redundant nucleotides are allowed within lp1. Interestingly, the GRV octaloop ends with a C as well, which suggests that such a loop does not follow a YGNNNNAD consensus. Therefore, in this octaloop variant, the putative interaction that involves the most downstream nucleotide may instead involve a more upstream one. The viral context of the GRV candidate xrRNA is notable, with the predicted hp1 stem and spacer mostly identical to xrRNA BNYVV, immediately followed by a stable hp2. However, this motif is followed by the stop codon of an annotated hypothetical protein, suggesting it is part of a coding sequence. Therefore, in this particular case, the presence of more than four nucleotides in the loop sequence could be explained by their role as specific codons in translation. The pentaloop-carrying xrRNA C-like motif found in an isolate of TRV RNA2 was determined to not yield an Xrn1-resistant structure. This contrasts with the Xrn1-resistant motif that carried the regular CGAA lp1 found in the TRV RNA2 isolate tested in our earlier study 10. Only one isolate with a pentaloop was discovered through our BLAST searches, contrasting with the number of hexaloop-containing motifs found, which suggests that the pentaloop-containing TRV xrRNA C-like motif may have evolved to become non-functional, or perhaps holds another function entirely.

Correlation between 5′UTR located xrRNA motifs and presence of an IRES
Several putative xrRNA C motifs were identified in dsRNA viruses of the Hypoviridae and Chrysoviridae families (Figs. 1, 2B). To our knowledge, this shows for the first time that putative xrRNAs may exist outside the realm of single-stranded RNA viruses. The fact that all four genomic RNAs of CcCV1 carry an xrRNA C-like motif at the 5' end of the 3' UTRs is a strong indication for their Xrn1-resistant functionality. Because we only ran BLAST searches against single-stranded RNA viruses in our previous study 10, we have now additionally identified tetraloop versions of the xrRNA C motif within the 5' UTRs of Wuhan insect virus 14 and Sclerotinia sclerotiorum hypovirus 2, both members of the Hypoviridae family (Fig. 1). While the conservation of IRES-like elements in the 5' UTR of Hypoviridae is yet uncertain, for several species within these mycoviruses IRES activity has been implied 30,31, and the location of the xrRNA C motifs within these species does allow for enough space between it and the polyprotein start codons. Therefore, the relationship between xrRNAs in 5' UTRs and the presence of IRES structures reduces the chance for such motifs to encounter scanning ribosomes. However, translation initiation on multicistronic viral RNAs is not always accounted for, and it cannot be ruled out that undiscovered xrRNA C-like motifs, or other types of xrRNA, could serve a regulatory role within coding sequences of viruses or elsewhere. At the least, xrRNA C being able to stall scanning ribosomes provides an explanation for why most of these motifs presently known are located in the 3' UTR, where they cannot interfere with translation processes for viruses that initiate from a 5' cap. While most members of the Flaviviridae carry strongly conserved xrRNAs in their 3' UTR 32,33, xrRNAs have also been discovered in the 5' UTR of Hepatitis C virus and Bovine diarrhea virus 34. These viruses have an IRES downstream, which allows them to continue initiation of translation even after losing the 5' cap and being subjected to Xrn1 degradation. Furthermore, this IRES allows them to bypass ribosomal scanning from the 5' end, and thus would prevent the stalling that could occur from the xrRNA.
A recent study on the distribution of xrRNA LTs has demonstrated their presence throughout the Tombusviridae and Solemoviridae families 22. Here we show instead that at least two species of Umbravirus, ETBTV and GRV, which belong to the Tombusviridae, carry a putative xrRNA C. As such, these findings indicate two divergent types of xrRNA present within the Umbravirus genus. Moreover, the Polerovirus CYDV-RPV carries both an xrRNA LT in its IGR, and a putative xrRNA C that is partly located within a CDS. Like the putative xrRNA of GRV, this latter motif embeds a stop codon. Whether these sequences play a role in translation by slowing down ribosomes, thereby affecting nascent protein folding similar to e.g. G-quadruplexes 35, or play a role in RNA silencing suppression, similar to xrRNA BNYVV, will require further study.
In this study, we show how xrRNA BNYVV is able to stall scanning ribosomes, leading to a significantly lower production of luciferase compared to constructs that harbor mutations that are known to abolish their Xrn1 resistance (Fig. 5). Like xrRNA LTs discovered in earlier studies 22, the putative xrRNA C motifs found in this study are located both in IGRs and 3' UTRs, and their function likely varies depending on this location. In order to regulate translation of their uncapped RNA, Tombusviridae make use of 3' cap-independent translation enhancers 23,36,37. This provides a potential role for xrRNAs located in IGRs, since subgenomic RNAs that result from stalling of Xrn1 may retain a certain level of translational activity. As such, ORFs located on these subgenomic RNAs are subjected to translational regulation through either protection of the RNA from degradation, or (in the case of an xrRNA not located at the 5' end of the subgenomic RNA) through stalling of the scanning ribosome. These findings, and the fact that novel xrRNA C candidates are found within IGRs, and even (at the end of) CDSs, highlight the importance of mapping the interplay of translation regulation and Xrn1-mediated decay that viruses employ.

Role of hp2
Most of the currently predicted xrRNA C sequences allow for a relatively stable hp2 (Fig. 1). Exceptions are the second motif found in CjTLV (a 3-bp hp2 with a G-U loop-closing bp that is unlikely to form), and the A-U- and G-U-rich motifs of CcCV1. However, we have tested several hp2 mutants of the xrRNA BNYVV in order to pinpoint what is minimally required for stalling Xrn1. This resulted in the conclusion that A22 and nucleotides downstream of hp2 were not essential for stalling Xrn1, and that a two-bp hp2 is sufficient provided it is capped with a stable tetraloop. Therefore, it can be deduced that the theoretical, shortest Xrn1-resistant sequence based on the xrRNA C motif would be 5'-GUC CGA AGA CGU UAA ACU ACG GGA AACCA-3'. It should be noted, however, that this would likely only be the case in vitro. Within the context of a highly structured UTR, this sequence would possibly not fold in such a way that the specific Xrn1-resistant topology could be stably maintained. So what exactly is the role of hp2 in stalling Xrn1? In flaviviral xrRNA, the pseudoknot involving its apical loop folds around the 5' end from which Xrn1 approaches, causing the enzyme to brace against the ring-like topology, halting degradation 17,18,28. If in the xrRNA C motif hp2 only functions to brace against the enzyme, it would explain why any small but stable hairpin retains the construct's Xrn1 resistance. Xrn1 is halted at one nucleotide upstream of hp1 21, which means that the first one or two nucleotides of hp1 actually enter the active site of the enzyme 38. It is therefore unlikely that hp2 functions in a similar fashion, serving as a topological blockade for Xrn1, as the enzyme likely has to brace against the surface of hp1 in order to reach the predicted stalling site. Conversely, the exclusively structural conservation, and this study showing the need for just a small hairpin, actually do suggest a mechanical function.

Correlation between thermodynamic stability of xrRNA and stalling of ribosomes and Xrn1
Following the discovery that xrRNA BNYVV is able to stall scanning ribosomes, we were eager to find out whether there is a positive correlation between that ability and Xrn1 resistance. Most of the constructs tested for ribosomal stalling capacity indeed show that a loss of Xrn1 resistance also coincides with a loss of ribosomal stalling. This correlation is well pronounced for the substitution of spacer nucleotides 18 and 19 (sp.mut), which are known to be crucial for stalling Xrn1. Furthermore, the constructs testing the Xrn1-resistant hexaloop variants CGA AAA, CGA AAU and CGA AAG indicated ribosomal stalling on par with, or better than, the wildtype. In contrast, the non-Xrn1-resistant CGA AAC lp1 appears to lose this ability at least partially, reaching an MRL comparable to the spacer mutant. As such, these assays provide a potential for additional information on the structural integrity of xrRNA C variants, where Xrn1-digestion assays do not account for structures 'more' resistant than wildtype. Furthermore, the pentaloop constructs, which are all unable to resist Xrn1, appear to slow down ribosomes even less than the spacer mutant, indicating an even stronger loss of thermodynamic stability and/or tertiary structure than caused by mutations in the spacer.
We must, however, take into account that certain substitutions within the xrRNA BNYVV may influence not only thermodynamic stability of the construct, but also the ratio of functionally versus non-functionally folded structures. A recent study highlighted the importance of this folding process for Zika xrRNA, showing how misfolded intermediates without the intricate structure necessary for stalling Xrn1 may ultimately form 39. Furthermore, Xrn1 and ribosomes do not process RNA in the same way. While Xrn1 is unable to progress through an xrRNA structure even after 20 h of incubation (Supplementary Fig. S3), this in vitro translation assay shows that in constructs that stall scanning ribosomes, luciferase is ultimately produced. This suggests that at least for a subset of mRNAs, the ribosomes are able to progress.
The manual assessment of novel xrRNA C motifs from BLAST searches is unlikely to be exhaustive, and it is therefore likely that more viruses carrying such structures can be found. Since the intrafamilial conservation of these motifs was not explored thoroughly, it remains unclear to what extent these structures are abundant throughout. The expansion of lp1 sequences that may confer Xrn1 resistance in the context of xrRNA BNYVV, and the minimal hairpin that is required, should aid further interrogation of viral genomes for these motifs. Knowing that they are able to stall scanning ribosomes, we may also look into the 5' UTRs of viral families with internal translation initiation capacities. Conversely, how often xrRNA C motifs are positioned in the genome such that they mostly evade ribosomes, as opposed to a position where they may provide a more regulatory, ribosome-inhibiting function, is yet open to question.

Figure 2. (A) Sequences of constructs that were tested for in vitro Xrn1 degradation assays based on the BLAST results listed in Fig. 1. The 5' leader sequence is given in grey, and the predicted hp1 and hp2 stems in green and red, respectively. An additional hairpin identified downstream of CYDV-RPV xrRNA is given in blue. (B) Denaturing (or non-denaturing, in the case of the fourth gel from the left) polyacrylamide gels showing the results for in vitro Xrn1 degradation on the constructs listed in (A). Data below the gels indicate the average (± SD) percentage of Xrn1-resistant RNA. BcHV1, RnHV2 and CYDV-RPV constructs were measured only once.

Figure 3. (A) Template construct used for in vitro Xrn1 degradation assays based on xrRNA BNYVV with the predicted secondary structure arrangement illustrated and the 5' leader sequence given in grey. (B-H) Denaturing polyacrylamide gels showing the results for in vitro Xrn1 degradation assays aimed at lp1 variants. Boxes above the gels depict what lp1 variants are tested in the corresponding lanes. RNA constructs are treated either with (+) or without (−) RppH and Xrn1. Note that the CGA AAU, CGA AAC, CGA AAG, ACGAA and CGA AAA A lp1 constructs were derived from plasmids used for the in vitro translation assays (see "Materials and methods" section), and thus did not have the same initial length. Data below the gels indicate the average (± SD) percentage of Xrn1-resistant RNA. Certain constructs in (E) were measured only once.
https://doi.org/10.1038/s41598-023-43001-4
that of the wildtype construct. The CGA AAC lp1 (lp1g) instead reached an MRL of about 0.8 times that of sp.mut. These results suggest that the loss of Xrn1 resistance by several penta- and hexaloops generally coincides with the loss of ribosomal stalling capacity, while the hexaloops found to be Xrn1-resistant show the potential of stalling ribosomes more than the wildtype xrRNA BNYVV motif. A construct carrying the CGA AAA A heptaloop (lp1h) reached an MRL that was twice that of the WT construct, although still around 0.4 times the MRL of sp.mut, indicating that heptaloops in an xrRNA C may partially retain the ability to stall ribosomal scanning.

Figure 4. Template construct used for in vitro Xrn1 degradation assays based on xrRNA BNYVV with the predicted secondary structure arrangement illustrated and the 5' leader sequence given in grey is depicted above. Underneath are denaturing polyacrylamide gels showing the results for in vitro Xrn1 degradation assays aimed at hp2 variants. Boxes above the gels depict what hp2 variants are tested in the corresponding lanes. RNA constructs are treated either with (+) or without (−) RppH and Xrn1. Data below the gels indicate the average (± SD) percentage of Xrn1-resistant RNA. N.D. indicates that the percentage could not be reliably determined.

Figure 5. In vitro ribosomal scanning inhibition assay. Mutations relative to the wildtype are given underneath, with dashes indicating no change. The wildtype sequence is numbered in grey and shows nucleotides involved in stems of hp1 and hp2 in green and red, respectively. Maximum relative luminescence. Data are presented as mean of measurements in triplicate, normalized to the maximum luminescence reached by sp.mut, with error bars depicting ± SD.