Multiple viral infections in Agaricus bisporus - Characterisation of 18 unique RNA viruses and 8 ORFans identified by deep sequencing

Thirty unique non-host RNAs were sequenced in the cultivated fungus, Agaricus bisporus, comprising 18 viruses each encoding an RdRp domain with an additional 8 ORFans (non-host RNAs with no similarity to known sequences). Two viruses were multipartite with component RNAs showing correlative abundances and common 3′ motifs. The viruses, all positive sense single-stranded, were classified into diverse orders/families. Multiple infections of Agaricus may represent a diverse, dynamic and interactive viral ecosystem with sequence variability ranging over 2 orders of magnitude and evidence of recombination, horizontal gene transfer and variable fragment numbers. Large numbers of viral RNAs were detected in multiple Agaricus samples; up to 24 in samples symptomatic for disease and 8–17 in asymptomatic samples, suggesting adaptive strategies for co-existence. The viral composition of growing cultures was dynamic, with evidence of gains and losses depending on the environment and included new hypothetical viruses when compared with the current transcriptome and EST databases. As the non-cellular transmission of mycoviruses is rare, the founding infections may be ancient, preserved in wild Agaricus populations, which act as reservoirs for subsequent cell-to-cell infection when host populations are expanded massively through fungiculture.


Results
Ten dsRNA-enriched samples extracted from mushroom fruitbodies were sequenced and viral contigs were de novo assembled. Contig structures were confirmed by PCRs spanning the full length of the contigs and Sanger sequencing of select regions as well as RACE PCR and Sanger sequencing of the 5′ ends. These analyses revealed 30 distinct RNA molecules not found in the A. bisporus genome which ranged in size from 0.5 kb-14.5 kb (Table 1  and Supplementary Table S1). These sequences have been submitted to GenBank (accession numbers: KY357487 to KY357519, Supplementary Table S2) using the sequences present in sample 003 as reference where possible and sample 2990 for the remaining RNAs.
We propose these RNA molecules represent 18 distinct viruses based on the presence of RdRp domains in each contig (Table 1). A nomenclature is proposed for these viruses of Agaricus bisporus Virus N (N replaced by a sequential number). This new nomenclature takes note of previous and existing descriptions made for A. bisporus viruses: AbV1 -the causative agent of La France disease 21 and AbV4 30 and so starts at AbV2, AbV3, and continues with AbV5. In addition, a mitochondrial virus was identified and named AbMV1 (Agaricus bisporus Mitochondrial Virus 1).
Sixteen of the contigs were consistent with monopartite viruses. In addition we hypothesise two, novel, segmented viruses 1 : AbV6 consisting of AbV6 RNA 1 (C2) with an RdRp domain and AbV6 RNA 2 (C12) with a capsid-like domain with homology to Tobacco Mosaic Virus (TMV); and 2 AbV16 consisting of four separate contigs, AbV16 RNA 1 (C22) containing an RdRp domain, AbV16 RNA 2 (C20) containing a Vmethyltransferase, and AbV16 RNA 3 (C29) and AbV16 RNA4 (C33) each with open reading frames of unknown function. Both viruses contained shared motifs in their 3′ UTRs, no 5′ UTRs shared motifs were found in any of the sequenced RNAs.
We have assigned C13, C14 and C18 to represent a single viral molecule, AbV9, however the evidence for this is incomplete and it may represent a segmented virus. C13, C14 and C18 contained different domains (Vmethytranserase, peptidase, RdRp and viral helicase) (Fig. 1), were of similar GC content (58.2%, 58.4%, 59.4% respectively), and their abundances were tightly correlated (C13 vs C14 r = 0.99, C13 vs C18 r = 0.99, C14 vs C18 r = 0.99). As a separate molecule C13 did not contain an in-frame stop codon, and the larger hypothetical construct encodes a single ORF contiguous across all three contigs which encodes all four viral domains arranged in the same order as found in the Tymoviridae family. Assembly of C13, C14 and C18 consistently produced three separate contigs and the conjoining ends of each separate contig contained mononucleotide runs. We were able to PCR across the hypothetical construct albeit resulting in PCR products of multiple sizes at the contig junctions (not shown). The rich GC contents of the three contigs may have confounded assembly programs and/or inhibited PCR efficiency.
The abundance of the viruses and ORFans, determined as the number of fragments sequenced per kb of contig, varied widely among samples ( Table 2). The four AbV16 RNAs were found at high levels in the samples showing overt disease (003, 004 and 1497) but absent from the remainder. AbV6 RNAs were present at high levels in 7 of the samples while AbV2, AbV6, AbV10 and ORFans 2, 3, 4, 5 and 7 were found in all ten samples. AbV15, AbMV1 and ORFan1 were found in only two samples ( Table 2).
However, phylogenetic analysis based on comparisons of RdRp sequences with known viruses identified only two viruses that could be readily assigned to known viral families, and a further two formed a clade with a single Hypovirus like sequences. Three viruses were identified with significant similarity to members of the Hypoviridae family, AbV2 (14.6 kb), AbV10 (7.03 kb) and AbV11 (6.98 kb) ( Table 1). AbV2 encompassed the partial sequence previously identified by Sonnenberg and Lavrijssen 28 and had 28% sequence identity at the protein level to Cryphonectria hypovirus 2 (CHPV2). Several different variants of AbV2 were detected differing in their 5′ end sequences ( Fig. 2 and Supplementary  Table S3). We define the standard version of AbV2 as C23-C1 based on the prevalence of its constituent contigs (Supplementary Table S3). Some assemblies identified ORFan3 (C19) to be joined to AbV2. A longer variant of AbV2 (C19-C23-C1) was found in the assemblies in which ORFan3 was fused to its 5′ end and this was confirmed by PCR across the junction. A shorter variant, AbV2 (C19-C1) was assembled and confirmed by Sanger sequencing, where ORFan3 replaced a 2654 nucleotide section at the 5′ terminus of AbV2. This variant had a shorter ORF of 11 kb and lacked the N-terminal DEXDc helicase domain.
AbV2 variants C23-C1 and C19-C23-C1 contained a C terminal RdRp and helicase similar to other hypoviruses 31 (Fig. 2) as well as an atypical second DEXDc helicase domain near the N terminus which was more similar to helicases found in viruses of the Potyviridae family than other Hypoviruses (Fig. 2). No DEXDc helicase domains have been found in any other members of this order and the result is suggestive of horizontal gene transfer.

Sample Number
AbV14 and AbV15 were closely related with 57.7% amino-acid identity. The phylogeny of the Narnaviridae has been revised by the inclusion of three new A. bisporus viruses (AbV14, AbV15 and AbMV1), six hypothetical viruses identified here from the TSA databases and six more described by Cook et al. 33 (Supplementary Figure S2  and Supplementary Table S6).
The Tymovirales are monopartite ss(+)RNA viruses infecting plants and fungi with genomes of 5.4 kb-9 kb, the structure of which places them within the alpha-like super group of viruses.
The seven A. bisporus viruses in the Tymovirales have large replication ORFs of size range 5.9 kb-8.2 kb, with each containing Vmethyltransferase, protease C21, viral helicase and type II RdRp domains characteristic of members of the alpha-like supergroup. A majority also contained additional domains (Fig. 1). A new phylogeny of the Tymovirales is proposed by analysis of the RdRp domains of these seven viruses and two hypothetical viruses identified in the TSA database (from Agave tequilana: accessions GAHU01087027 and GAHU01094215) with high similarity to the C14 component (RdRp) of AbV9). This places the viruses into three new clades within the Tymovirales order, tentatively named Readiviridae, Emraviridae, Teagaviridae (Supplementary Figure S3).
We propose AbV6 is a bipartite segmented virus consisting of the AbV6 RNA 1, encoding a replicase domain, and AbV6 RNA 2 encoding a capsid-like motif. The abundance of AbV6 RNA1 and AbV6 RNA2 across all 10 samples is highly correlated (r = 0.95) ( Table 2). A distinct strain of AbV6 RNA1 (AbV6-2990), 8921b in length, was identified in sample 2990 with 88% pairwise identity with AbV6 RNA1 and with the same genome organisation. A 226 base motif was identified in the 3′ UTRs of AbV6 RNA1, strain AbV6-2990 and AbV6 RNA2 (Fig. 3).
AbV6 and strain AbV6-2990 both contain zinc-finger RING domains (Pfam 13920) at the N terminus. AbV3 and AbSV contained a tymovirus cysteine endopeptidase as well as an N terminal DEXDc helicase while AbV5 had a domain similar to a ribosomal RNA processing protein (PF08524).
Benyvirus like sequences. AbV8 and AbV13 have sequence similarity to the replicase segment of the Benyvirus genus within the Rubi-like virus grouping (Table 1 and Supplementary Figure 4). Of the two ORFs in AbV8, the larger contained Vmethyltransferase, helicase, protease and RdRp domains (Fig. 1) while the second ORF was of unknown function, although it contained a nucleocapsid motif as predicted by HHpred (e-val = 2.6E-05, p = 1.6E-09). AbV13 had a different genome coding strategy with three ORFs; ORF1 contained a helicase domain, ORF2 contained an RdRp and ORF3 coded for a protein of unknown function (Fig. 1).

AbV16 -the proposed causative agent for the Brown Cap Mushroom Disease. We propose
AbV16 to be a segmented virus consisting of four components: AbV16 RNA1, encoding a type II RdRp, AbV16 RNA2, encoding a type 1 Vmethyltransferase domain and AbV16 RNA3 and AbV16 RNA4 each of which encodes an ORF with no homology to known sequences (Fig. 1). All four molecules were found at high abundance in the three samples showing fruitbody browning symptoms (Table 2). In addition, all of the AbV16 RNAs contained a 25 base motif in the 3′ UTR, identified using MEME 34 (Fig. 4). The motif: "STTCAGSGTBBVWSHAGCRGTAAWT" had a probability of 8.2 × 10 −4 , and was the only motif found in AbV16 RNAs with a probability less than 1. The four AbV16 molecules and ORFan8 were homologous to the partial viral transcripts previously associated with browning symptoms reported by Eastwood et al. 23 (Supplementary Table S7).
Initial phylogenetic analysis of the RdRp domain of AbV16 RNA 1 was unable to place it consistently within any clades of the alphavirus-like supergroup. A BLASTP search of the TSA and EST databases using the translated ORFs revealed sixteen sequences with high similarity to AbV16 RNA 1 ( Table 3). Nine of these datasets also contained homologs to AbV16 RNA 2 while two contained homologs to AbV16 RNA 3 (Table 3). Phylogenetic analysis of the RdRp domains from viruses in the alphavirus-like supergroup, AbV16 RNA 1 and its homologs from the TSA and EST database searches (Table 3) placed AbV16 and its homologs into their own unique clade (Supplementary Figure S5) distinct from all previously described viral groupings. Therefore, the hypothesis is made for a new viral family which has been named Ambsetviridae (after the births Amber Deakin and Seth Dobbs during the course of this research).   ORFans. We identified eight further non-host RNAs ranging in size from 0.5 kb to 5 kb with potential open reading frames of greater than 250 bases (Supplementary Table S1). These sequences were named as ORFans rather than satellites as they lacked any 5′ or 3′ UTR homology to the viruses present and the potential ORFs lacked similarity to any known sequence. ORFans 4, 5, and 7 each consisted of several repeating regions and may be circular and potentially code for proteins of unknown function. ORFans 2, 3, 4, 5, and 7 were found in all samples. ORFan 1 and ORFan6 were found in samples 003 and 004 and then only in low copy number. ORFan 8 has sequence homologous to the RNA molecules VX7 (28; and Anton Sonnenberg: personal communication) and a transcript fragment identified by Eastwood et al. 23 associated with the fruitbody browning (Supplementary Table S7). Although ORFan8 had similar levels of abundance to the AbV16 RNAs (Table 2), it was not classified as a component of AbV16 virus as it lacked the 3′ motif.

Discussion
Multiple virus infections have been identified in mushroom fruitbodies of Agaricus bisporus by the discovery of 18 RNA viruses and 8 ORFans by deep sequencing. The viruses are phylogenetically distinct, the majority with low similarity to any virus previously described and were neither satellite nor defective RNAs. They represent the most extensive collection of unrelated multiple viral infections yet described, the closest comparable example being for a single diseased isolate of Ophiostoma novo-ulmi which contained 12 viral RNAs, all related mitochondrial viruses 32 .
Despite the multiple viral infections, the A. bisporus cultures grew and thrived, extracting nutrients from a complex substrate and producing fruitbodies, suggesting effective co-existence strategies and adaptive mechanisms. Samples 003, 004 and 1497 displayed only mildly debilitating (although economically important) symptoms of fruitbody browning while harbouring 17-24 viral RNAs (viruses and ORFans) (Table 3). Similarly, the 10-17 viral RNAs identified in fruitbodies grown from sub-cultures, and the 8 viral RNAs in commercial culture (Table 2), are characteristic of viruses with persistent life-styles and low fitness costs on the host 35 . The A. bisporus genome contains the components of viral defence mechanisms with homologs to both RNAi and yeast SuperKiller (SKI) mechanisms 19 and RNAi has been demonstrated in A. bisporus [36][37][38] . It is possible that some ORFs encode viral suppressors of RNAi but the identity and nature of any suppressors were not revealed by this research. However, the viral genomes contained many peptidases, numerous ORFs of unknown function and several domains of cryptic function (e.g. RING domain on AbV6 and rRNA processing on AbV5).
The 18 viruses have the closest amino acid homology to the RdRp of single-stranded RNA viruses. The assignment of 'single-strandedness' is at variance with both previous descriptions of the A. bisporus RNAs as double-stranded 10,25,28 and the intended specificity of the purification procedures. It is possible therefore that these viruses have secondary and tertiary structures that enable selective binding to cellulose and protection from S1 nuclease. Three of the viruses have domains with similarities to capsid proteins (AbV6 RNA2, MBV and AbV8), although no viral particles have been observed in MVX-infected tissues 10 . Two of the viruses are hypothesized to be multipartite (AbV6 and AbV16) on the basis of correlative abundance in different samples and a common 3′ motif.
The extent of the multiple infections reported here is suggestive of a diverse, dynamic and interactive ecosystem with evidence of sequence variation, alternative splicing, horizontal gene transfer, and variable fragment number (AbV16). Sequence variability ranged from 0.04-0.07 SNP's/kb (for ORFan2, ORFan3 and AbV5) suggesting a high fidelity replicase to two orders of magnitude higher for AbV16 RNA1, AbV14 and AbV16 RNA2 (8.9-10.1 SNP's/kb) (Supplementary Table S5). AbV2 displayed alternative splicing of C23, C19 and C19-C23 fused at the 5′ terminus and 2.6 kb from the 5′ terminus of C1. The C1 component of AbV2 contains different helicase domains from phylogenetically distinct orders/families suggestive of horizontal gene transfer ( Fig. 2 and Supplementary Figure S1b). Evidence supporting horizontal gene transfer has also been found between distinct viruses in Sclerotinia sclerotiorum 13 and between fungal species 39 . In this respect, the composition and abundance of A. bisporus viruses appears to be similar to that of plants and plant viruses, showing infection gains from the environment and losses probably due to competition and antagonistic interactions 14 . Local infection is inferred for sample 003 as both samples 003 and 004 were grown from A. bisporus inoculated compost from the same source, their fruitbodies collected on the same day, yet 003 had two additional viruses (AbV13 and AbV15) ( Table 2). Virus loss was indicated for sample 2735, as it had high abundance of AbV16 RNAs when first collected (data not shown) but no AbV16 RNAs detected in this study. Similarly, loss of AbV16 RNA1 has been demonstrated over 8 days compost culture between the first and second flush of fruitbody production (AbV16 RNA1 described as the 1.8 kb RNA or band 19 in ref. 40). The mechanism for the loss of AbV16 RNAs may be related to the large changes in levels of AbV16 detected during viral life-style transitions, persistent to acute, in individual fruitbodies 23 .
The Ambsetviridae proposed as a new viral family of plant and fungal viruses comprised the four component virus AbV16 and homologues found in the TSA and EST databases ( Table 3). The common 3′ motif in the four RNAs of AbV16 may represent part of a mechanism for co-ordinated replication. However, a further RNA molecule, ORFan8, was also tightly correlated in abundance with AbV16 RNAs yet lacked the 3′ motif (Table 2). AbV16 may thus have a variable number of fragments: AbV16 RNAs 1-4, with replication controlled by the common 3′ motif and with ORFan8, whose co-replication is determined by a different mechanism. The apparent discrepancy of 4 or 5 RNAs associated with the browning symptom has been previously reported by Grogan et al. 10 who identified 4 RNAs from UK samples while Sonnenberg and Lavrijsen 28 described 5 RNAs in samples from The Netherlands. The four components of AbV16 and ORFan8 encompass the transcript fragment sequences identified by Eastwood et al. 23 as being associated with a persistent/acute life-style transition (genome copies varying by 10 3 fold) and likely to be the aetiological agent of Brown Cap Mushroom Disease. The high variability found in AbV16 RNAs 1 and 2 is consistent with the high mutation rates associated with acute viruses 35 .
Scientific RepoRts | 7: 2469 | DOI:10.1038/s41598-017-01592-9 The fruitbodies grown from a non-diseased commercial culture (sample 138) also harboured ubiquitous viruses and ORFans (i.e. AbV2, AbV6, AbV10, ORFan2, ORFan3, ORFan4, ORFan5 and ORFan7). Two of these show high similarity to those previously reported (by sequence, size and presence in non-symptomatic fruitbodies): the AbV2 sequence corresponded to the 14-17 kb RNAs of Sonnenberg and Lavrijsen 28 , the 16.2 kbp RNA of Grogan et al. 10 and the >13 kbp L-RNAs of Kuang et al. 25 ; and ORFan2 is likely to be the 2.7 kb VX3 of Sonnenberg and Lavrijsen 28 , the2.4 kb S-RNA of Kuang et al. 25 and the 2.4 kb RNA of Grogan et al. 10 who identified this RNA in 99% of the 320 healthy or diseased mushroom samples tested. These ubiquitous and persistent viruses have clearly developed efficient strategies for replication 12,35 which may have been amplified by cultivation and breeding of the host.
Geographical differences in the viral sequences were identified. The sequence of the North American ORFan2 (ref. 25 and pers. com. Peter Romaine) had 106 bases different at the 3′ end from the European ORFan2. In addition strain AbV6-2990 was found only in the sample from the Middle East while the related strain AbV6 was found in both Middle Eastern and European samples. These differences may relate to geographically distinct sequence variants or to differences in local reservoirs of infection.
Mycoviruses are generally considered to lack infectivity as extracellular free particles however one report demonstrated infection by a purified DNA mycovirus 11 . This lack of infectivity suggests that the original infections of A. bisporus may be ancient, before widespread cultivation in the 17 th century 41 and probably represent numerous, separate infection events. The cultivation of A. bisporus is an international business involving long-distance movement of commercial cultures, colonised compost and mushroom fruitbodies, all of which are likely to increase the reservoirs of infected material and enhance virus dispersal leading to co-infections by viruses previously separated geographically. Virus transmission from wild to cultivated A. bisporus cultures probably occurred via aerially dispersed hyphae and spores. The longevity of basidiomycete fungi in the wild 42 , their propensity for vegetative growth and endurance would naturally facilitate the long-term survival of reservoirs of virus-infected mycelium in the environment.
Improved sequencing of host and virus genomes and bioinformatic techniques have suggested that multiple virus infections of eukaryotes may be more common than previously indicated by the classical application of Koch's postulates. In this case disease may be the consequence of an altered balance among members rather than the presence of one sequence per se. In the apathogenic state, interactions between host and viruses involve complex co-existence strategies for symbioses. However, a changed environment can lead one virus such as AbV16 to replicate to very high levels and become pathogenic. Our data suggest that diagnostic tests for mushroom browning may need to consider relative virus load as well as the presence or absence of critical RNAs. Furthermore, the potential exists to re-balance virus numbers in favour of apathogenicity if the environmental factors leading to selective amplification and transmission through mycelial networks can be understood. This will be the subject of further reports.

Materials and Methods
Biological material and sample collection. Next Generation Sequencing was performed on RNA extracted from ten samples of Agaricus bisporus mushroom fruitbodies (Table 4), all strain A15 (Sylvan: www. sylvaninc.com). Three of the samples consisted of diseased mushrooms (displaying the fruitbody browning symptom) collected from different mushroom farms in 2004 and 2011 (Table 2). Six further samples were from fruitbodies grown at Kinsealy Research Centre, Ireland in simulated commercial conditions of mycelial sub-cultures taken from diseased fruitbodies (displaying a range of viral disease symptoms) collected between [2000][2001][2002][2003][2004] and maintained in the laboratory as mycelial sub-cultures as per Grogan et al. 43 (Table 4). Fruitbodies from the sub-cultures did not display the brown tissue symptom. In addition fruitbodies were grown and collected (sample 138) from non-diseased commercial cultures (Sylvan Inc. Ireland) at Kinsealy Research Centre, Ireland. All fruitbodies were frozen in liquid nitrogen and stored at −80 °C until use.
RNA extraction from mushroom fruitbodies. Fruitbodies were freeze dried and ground in liquid nitrogen. 3.5 g dry weight of each ground mushroom sample was suspended in 4x w/v of an STE-based lysis buffer, pH7, and RNA was extracted with an equal volume of 5:1 phenol chloroform (pH 4.5). The RNA was purified  Table 4. Details of samples used for Next Generation Sequencing. and enriched for dsRNA by the addition of ethanol to 16%, followed by chromatography through two cellulose columns (medium fiber cellulose -Sigma-Aldrich, Dorset, UK), using a method modified from the protocol of Morris and Dodds 44 . 15-20 extractions from each fruitbody sample were combined before digestion of DNA and ssRNA by DNAse I and S1 Nuclease (Thermo Scientific, Waltham, UK) respectively. The amplified library was purified using Ampure beads (Beckman Coulter UK Ltd, High Wycombe, UK) and size distribution determined using a Tapestation D1K system (Agilent Technologies, Santa Clara, USA). Libraries were quantified by Picogreen (Invitrogen, Paisley, UK) and pooled to provide equal quantities of RNA per library.
Finally, a Quantitative PCR was performed, using Agilent qPCR Library Quantification Kit and a MX3005P instrument (Agilent Technologies, Santa Clara, USA), to measure the relative concentration of the pool compared to a previously sequenced mRNA library in order to determine the volume to use for sequencing. Sequencing was performed on an Illumina MiSeq with a read length of 150 bp, paired end reads and an average insert size of ~200 bp, at the Wellcome Trust Centre for Human Genetics (Oxford, UK).

Data filtering.
Fastqc was used to check the quality of the read data. Reads were trimmed using Trimmomatic to remove nucleotides with quality less than Phred33+15 and to only retain reads of 35 nucleotides or greater in length. Bowtie was used to align reads to Agaricus bisporus genes, known rRNA, Pseudomonas and PhiX and these reads were excluded from the assembly.
Assembly. De novo assembly was carried out using Velvet v1.2.08 and redundant contigs were removed using Oases v0.2.08 and the Fastx toolkit. Cap3 was used to assemble the smaller contigs into larger scaffolds, producing 39-140 contigs per sample.
Virus discovery pipeline. The NCBI Batch Web CD-Search Tool (default settings) was used to search the assembled contigs for conserved protein domains. Based on the domains returned contigs were assigned to one of four groups; (1) viral, (2) non-viral, (3) chimera (i.e. mix of viral and non-viral) or (4) unknown (if no domains found). Contigs which were deemed to be of non-viral or chimeral origin were discarded.
Open Reading Frames (ORFs) and the flanking Untranslated Regions (UTRs) were predicted using Geneious (6.0.6) 45 as well as the GC content for each of the contigs. The ORFs were interrogated using BLASTX to search for homology to viruses within the NCBI non-redundant protein sequence database. For sequences with no or low scoring BLASTX scores (e > 10 −2 ), the translated ORFs were searched for viral homologies within the non-redundant protein sequence database using DELTA-BLAST. All contigs (and minor ORFs) which remained unidentified were further searched for homologies using HHpred 46 . The identified contigs were assigned numbers C1 to Cn.
All identified viral sequences were then searched for homology, using TBLASTX, to sequences within the Expressed Sequence Tag (EST) and Transcriptome Shotgun Assembly (TSA) databases, to identify further unidentified hypothetical viruses.
Phylogenetics. Replicase sequences (e.g. sequences containing RdRp, helicase, viral methyltransferase protein domains etc.) for homologous and out-group viruses were downloaded from GenBank. Sequence alignment was carried out using MUSCLE 47 implemented in Geneious (6.0.6) 45 , with iterations set to 100 and clustering method set to Neighbor-joining (other settings default). Bayesian inference trees where constructed using MrBayes (2.0.6) 48 implemented in Geneious (6.0.6) 45 . Due to the divergent nature of some of the phylogenies, where indicated, Neighbor-joining trees were constructed using Geneious tree builder (6.0.6) 45 .
Motif Searching. MEME 4.9.1 34 was used to search for shared motifs in 5′ and 3′ UTR sequences. The maximum number of motifs to return was set to 20 and "Search given strand only" was checked. Other settings were kept as default. Pseudoknots were predicted in both 5′UTR and 3′ UTR using DotKnot_1.3.1 49 .

PCR Validation of Sequence Assembly.
To confirm the correct assembly of each contig, PCR primer pairs were designed to generate overlapping products of 600-900 bp in length designed to cover the entire length