Main

miRNAs are conserved, endogenous 21–23-nucleotide (nt) noncoding RNA molecules that regulate protein-coding gene expression in plants and animals via the RNA silencing machinery (for reviews see refs. 1 and 2). Depending on the extent of complementarity, miRNAs can either direct cleavage or regulate translation of their target mRNAs. Candidate targets of animal miRNAs have been computationally predicted by various groups (reviewed in ref. 1), but only a very small subset of these predictions have been validated.

Typically, miRNAs are derived from long pol II primary transcripts3 that are processed in the nucleus by the ribonuclease III enzyme Drosha together with DGCR8, a protein containing a double-stranded RNA (dsRNA)-binding domain (reviewed in ref. 4). The excised 70-nt fold-back dsRNA precursor (pre-miRNA) is then exported to the cytoplasm by the export factor Exportin 5 (ref. 5). The pre-miRNA is later cleaved by the ribonuclease III enzyme Dicer to excise the miRNA in the form of a small interfering RNA (siRNA)-like duplex6, and this is followed by asymmetric assembly of the miRNA strand into effector complexes containing the Argonaute protein.

The majority of miRNAs were identified by cDNA cloning, whereas others have been identified by genetic analysis or by computer prediction. Plant and invertebrate genomes encode just over a hundred miRNA genes; vertebrate genomes encode several hundred miRNA genes. Most recently, several miRNA genes have been discovered in EBV, a member of the herpesvirus family7. A searchable database containing all known miRNA sequences is available8 (http://www.sanger.ac.uk/Software/Rfam/mirna/).

The discovery of miRNAs in a herpesvirus indicates that viruses have evolved to exploit RNA silencing for regulation of both host and viral genes7. The EBV miRNAs have no notable sequence similarity with known host cell miRNAs. Additionally, sequence homology searches against the genomes of other human herpesviruses have not revealed any orthologous miRNAs. This observation indicated that, unlike host miRNAs, viral miRNAs are difficult to identify using existing miRNA gene prediction software, which heavily relies on cross-species sequence conservation9,10. Here we present a new computational method that predicts likely locations of miRNA precursors in individual genomes using only information about the local sequence composition and about the predicted secondary structure of RNA transcripts. Applying this method to a subset of pathogenic viruses, we predicted that DNA viruses, and in particular the family of herpesviruses, would be the most likely to encode miRNAs. By small-RNA cloning, we confirmed that KSHV and MHV68, two members of the gammaherpesvirus subfamily, and HCMV, a member of the betaherpesvirus subfamily, encode miRNAs. At the same time, we neither predicted nor experimentally detected miRNAs in human cells infected by small-genome RNA viruses, including yellow fever virus (YFV), hepatitis C virus (HCV) and human immunodeficiency virus 1 (HIV1).

Results

Computational method of pre-miRNA prediction

Our computational method assumes that miRNA precursors have specific composition and secondary structure features that are recognized by the miRNA-processing enzymes during miRNA biogenesis. We therefore described each candidate stem in terms of a set of properties (detailed in Methods) and then trained a support vector machine (SVM) to separate known miRNA precursors from a set of stem-loops unlikely to contain miRNAs. To report specific miRNA precursor predictions, we set a score threshold, which we chose so as to achieve a good tradeoff between sensitivity and specificity in the training data, namely to retrieve 71% of the positive examples in the training set while misclassifying only 3% of the negative examples. Candidate stems to which the SVM assigns a score above this threshold are considered miRNA precursor predictions. To prioritize our cloning studies, we also calculated the probability that at least one of the candidate stems in a given viral genome is a miRNA precursor. This indicator takes into account the full distribution of SVM scores assigned to all the candidate stems in the genome and is therefore not dependent on a score threshold.
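The threshold choice described above can be illustrated with a minimal sketch. It assumes that classifier scores are already available for the positive and negative training examples; the function names and the toy scores are hypothetical and are not taken from our implementation.

```python
def rates_at_threshold(pos_scores, neg_scores, threshold):
    """Return (sensitivity, false-positive rate) for a given score cutoff.

    A stem counts as a prediction when its score is strictly above the cutoff.
    """
    true_pos = sum(1 for s in pos_scores if s > threshold)
    false_pos = sum(1 for s in neg_scores if s > threshold)
    return true_pos / len(pos_scores), false_pos / len(neg_scores)


def pick_threshold(pos_scores, neg_scores, max_fpr):
    """Return the lowest cutoff whose false-positive rate is at most max_fpr.

    The false-positive rate can only fall as the cutoff rises, so the lowest
    admissible cutoff is also the one with the highest sensitivity.
    """
    for t in sorted(set(pos_scores) | set(neg_scores)):
        sens, fpr = rates_at_threshold(pos_scores, neg_scores, t)
        if fpr <= max_fpr:
            return t, sens, fpr
    raise ValueError("no scores supplied")


# Hypothetical toy scores, for illustration only.
toy_pos = [1.2, 0.8, 0.3, -0.1, 0.5]
toy_neg = [-1.5, -0.9, -0.4, 0.1, -2.0, -0.7, -1.1, -0.2, -1.8, -0.6]
cutoff, sensitivity, fpr = pick_threshold(toy_pos, toy_neg, max_fpr=0.1)
```

Scanning only the observed scores is sufficient, because the two rates change only at those values.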

We applied our computational method to at least one representative member of each family of human pathogenic viruses and predicted that many members of the herpesvirus family encode miRNAs (Table 1). In contrast, for RNA viruses we generally predict low probabilities of encoding miRNAs. This is not a mere consequence of their smaller genomes, as can be inferred from the fact that the calculated probability of encoding at least one miRNA varies among herpesviruses from 45.8% for HHV7 to 100% for HHV2 and HHV8, in spite of their large and comparable genome sizes.

Table 1 Summary of pre-miRNA predictions for selected human and mouse DNA and RNA viruses

We also used our method to predict the specific location of miRNA precursors in the viral genomes, using a threshold score as described above. The number of stems with scores higher than the threshold is also not a direct function of the genome size (Fig. 1). For EBV, seven pre-miRNAs were predicted to originate as transcripts from five distinct genomic loci (Supplementary Table 1 online). Two of these loci had pre-miRNAs predicted from the sense and antisense transcripts. Of the five cloned miRNAs from latent-stage infection7, two were predicted correctly, and a third corresponded to a predicted complementary strand. It is conceivable either that the other predicted miRNAs were expressed at other stages of the infection cycle and therefore escaped cloning or that they were false-positive predictions.

Figure 1: Display of the miRNA predictions and cloning results as a function of the virus genome size.

The horizontal axis represents the sizes of the viral genomes studied. Circles correspond to the number of predicted pre-miRNAs for a particular virus and squares to the total number of cloned miRNAs. The number of predicted pre-miRNAs that were verified by cloning is indicated in parentheses.

Experimental isolation of new viral miRNAs

Guided by our computational predictions, we cloned and sequenced small RNAs from cells infected with HCMV, KSHV and the KSHV-related MHV6811. Additionally, we studied the population of small RNAs isolated from cells infected with small-genome RNA viruses, namely YFV, HCV and HIV1. We used our previously established small-RNA cloning protocol12 for the preparation of YFV, HCV and HIV1 small-RNA libraries and a modified version for the preparation of KSHV, HCMV and MHV68 libraries. The modified cloning protocol required the use of the N-terminal domain of T4 RNA ligase 2, Rnl2(1–249) ligase13, for joining a preadenylated 3′ adapter oligonucleotide to the gel-purified small RNAs. Rnl2(1–249) ligase, like T4 RNA ligase 1, can join a preadenylated 3′ adapter oligonucleotide to the 3′ hydroxyl–containing small RNA molecules, but the undesired circularization of small RNAs caused by adenylate transfer from the 3′ adapter to the 5′ phosphate–containing small RNAs is notably less than that seen with T4 RNA ligase 1. We then sequenced at least 1,000 independently cloned small RNAs from each library. Annotation of the cloned RNAs revealed a viral content ranging from 0 to 21% depending on the virus studied (Supplementary Table 2 online).

We identified viral miRNAs as the sequences that were cloned multiple times and that mapped perfectly to the viral genome at a location adjacent to an approximately 30-nt highly complementary sequence, which gives rise to the pre-miRNA (Figs. 2, 3 and 4). In some instances, the existence of such a dsRNA precursor was directly confirmed by cloning small RNAs from both arms of the pre-miRNA (Figs. 2b and 4b). We identified ten KSHV miRNAs by cloning from the latently KSHV-infected body cavity–based B lymphoma BCBL1 cell line14 (Fig. 2b and Supplementary Table 3 online). Our computational method predicted a total of eight pre-miRNAs, five of which overlap with the experimental set (Supplementary Table 1 online). Furthermore, we identified nine MHV68 miRNAs by cloning small RNAs from latently MHV68-infected mouse B lymphoma S11 cells15 (Fig. 3b and Supplementary Table 4 online). From four predicted pre-miRNAs that were confined to the (+) and (−) strands of only two genomic locations, we experimentally confirmed one miRNA. But we observed that another four of the cloned miRNAs had a score immediately below the prediction threshold owing to an unusual feature caused by their pol III tRNA promoters. Finally, we identified nine HCMV miRNAs cloned from small RNAs isolated from lytically infected primary human fibroblasts (Fig. 4b and Supplementary Table 5 online). We cloned five miRNAs from 11 predicted miRNA genes (Supplementary Table 1 online).
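The first two criteria above (cloned multiple times, perfect match to the viral genome) can be sketched as follows. This is an illustrative simplification, not the annotation pipeline we used: the function names, the `min_clones` parameter and the toy sequences are hypothetical, and the check for an adjacent complementary arm is omitted because it requires secondary structure prediction.

```python
from collections import Counter

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Reverse complement of an upper-case DNA string."""
    return dna.translate(_COMPLEMENT)[::-1]


def exact_hits(genome, read):
    """Perfect matches of an RNA read on either genome strand.

    Returns (start, strand) pairs; start is 0-based on the (+) strand.
    """
    query = read.upper().replace("U", "T")
    hits = []
    for strand, target in (("+", query), ("-", reverse_complement(query))):
        start = genome.find(target)
        while start != -1:
            hits.append((start, strand))
            start = genome.find(target, start + 1)
    return hits


def candidate_mirnas(genome, reads, min_clones=2):
    """Keep sequences cloned at least min_clones times that map perfectly."""
    counts = Counter(r.upper() for r in reads)
    candidates = {}
    for seq, n in counts.items():
        if n >= min_clones:
            hits = exact_hits(genome, seq)
            if hits:
                candidates[seq] = (n, hits)
    return candidates


# Hypothetical toy genome and reads, for illustration only.
toy_genome = "AAACCCGGGTTTACGTACGATCG"
toy_reads = (["CCCGGGUUUACG"] * 3 + ["CGAUCGUACGU"] * 2
             + ["ACGAUCG"] + ["GGGGGGGGGG"] * 2)
toy_candidates = candidate_mirnas(toy_genome, toy_reads)
```

In this toy example the singleton read and the read without a genomic match are discarded, and the second retained read maps to the (−) strand.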

Figure 2: Genomic positions and predicted secondary structure of KSHV pre-miRNAs.

(a) Diagram of the miRNA-containing fragment of the KSHV genome. Terminal repeats (TR) are shown as black boxes. DR1 and DR2 indicate (G+C)-rich direct repeats 1 and 2. Cloned miRNAs are shown as dotted lines, predicted miRNAs as gray rectangles, the two alternate promoters for kaposin transcripts as black arrows, the kaposin transcripts as black lines, and the intronic region in the larger transcript as a break in the line. The large open arrows indicate known ORFs. The thick gray arrows below show ORFs for kaposin isoforms A, B and C. (b) Predicted KSHV pre-miRNAs. The predominantly cloned miRNAs are indicated with a black bar, asterisks highlight low-abundance small RNAs, and the black arrow indicates the A-to-I edited residue. (c) Northern blots for KSHV-miR-K12-10a, miR-K12-10b, miR-K12-4, miR-K12-3 and miR-K12-1 from total RNA isolated from a KSHV-negative cell line (BJAB) and from BCBL1 cells at the indicated times after TPA treatment. The expression of human miR-21 was also examined for reference. The numbers below the miR signal indicate relative signal intensity with respect to the 0 h signal after normalizing for gel loading by using the tRNA signal.

Figure 3: Genomic positions and predicted secondary structures of MHV68 pre-miRNAs.

(a) Diagram of the miRNA-containing locus of the MHV68 genome. Terminal repeats (TR) are shown as black boxes. Cloned miRNAs are shown as dotted lines, predicted miRNAs as gray rectangles, and previously characterized tRNA-like molecules19 as black arrows. The large open arrows indicate known ORFs. (b) Examples of predicted fold-back of MHV68 pre-miRNA primary transcripts. The tRNA is on the left, and a black bar indicates the predominantly cloned miRNAs. (c) MHV68 miRNAs are expressed from the tRNA primary transcripts. Northern blot for MHV68-miR-M1-1, tRNA1, miR-M1-5, tRNA4, miR-M1-7-5p and tRNA5 from total RNA isolated from the MHV68-negative A20 cell line (−) and from the MHV68-positive S11 cell line (+). M, 5′ radiolabeled 25-base-pair DNA ladder.

Figure 4: Genomic positions and predicted secondary structures of HCMV pre-miRNAs.

(a) Diagram of the miRNA-containing fragments of the HCMV genome. Cloned miRNAs are shown as dotted lines and predicted miRNAs as gray rectangles. The miRNAs encoded from the (+) strand of the genome are shown above the genome and those deriving from the (−) strand of the genome are below. The large open arrows indicate the orientation of the viral ORFs. (b) Predicted HCMV pre-miRNAs. The predominantly cloned miRNAs are indicated with a black line bracket. An asterisk is used to highlight low-abundance small RNAs.

As expected from our computational prediction, we did not experimentally identify miRNAs derived from HCV, and identified only two and four different small RNAs in (+)-strand orientation from HIV1 and YFV, respectively (Supplementary Table 6 online). The corresponding genomic regions surrounding the cloned RNAs were not predicted to form stable base-paired structures, suggesting that these small RNAs are random degradation products of the turnover of longer viral transcripts. These indirect observations are consistent with the idea that HCV, YFV and HIV1 do not elicit an RNA interference (RNAi) response. Alternatively, these viruses could have evolved inhibitors of RNAi. In the case of HCV, we examined if infected cells were still able to carry out RNA silencing using synthetic siRNAs or hairpin RNA against the lamin A/C gene. Infected and uninfected cells supported lamin A/C gene silencing with equal efficiency, arguing against the expression of a viral inhibitor of RNAi (Supplementary Fig. 1a online). Consistent with the unperturbed RNA silencing, northern blot analysis of several host cell miRNAs revealed no differences between infected and noninfected cells (Supplementary Fig. 1b online).

Features of the KSHV miRNAs

All the cloned and predicted KSHV miRNAs are confined to a 5-kb region of the 140-kb viral genome, which also encodes the transforming protein–coding kaposin gene16. Two alternate transcripts for kaposin have been described17,18, which each give rise to three possible isoforms, A, B and C, of the kaposin protein depending on the translation codon used18 (Fig. 2a and Supplementary Table 3 online). KSHV-miR-K12-10 is located within the open reading frame (ORF) K12 so that its excision coincides with the destruction of a fraction of the kaposin mRNA. The other KSHV miRNAs are located within the intronic region of the larger kaposin primary transcript (Fig. 2a). But, as it is unclear if the large kaposin transcript is expressed in BCBL1 cells17, it is also conceivable that these miRNAs are expressed from a promoter located immediately upstream of the miRNA cluster.

We also noticed two approximately equally frequently cloned nucleotide sequence variants of KSHV-miR-K12-10, which differ in having either an A or a G at position 2 of the cloned sequence (Fig. 2b and Supplementary Table 3 online). We refer to the A-containing sequence as KSHV-miR-K12-10a and the G-containing sequence as KSHV-miR-K12-10b. Database analysis identified perfect matches of KSHV-miR-K12-10a and KSHV-miR-K12-10b to different viral isolates. When we PCR-amplified and sequenced ten clones from the miRNA-containing segment of the BCBL1-borne KSHV genomic DNA, we were able to recover perfect matches only to KSHV-miR-K12-10a, suggesting that KSHV-miR-K12-10a pre-miRNA is partially edited by a dsRNA-specific deaminase that converts the adenosine at position 2 into inosine. This editing event would also change a glycine to a serine in the kaposin A and C proteins. To assess the extent of A-to-I conversion in the kaposin mRNA/pre-KSHV-miR-K12-10 transcripts, we prepared a poly(A)-specific cDNA library from our BCBL1 cells. Sequencing of 14 clones of PCR-amplified kaposin mRNA/pre-KSHV-miR-K12-10 segment revealed 12 A-to-I substitutions.

BCBL1 cells harbor replication-competent KSHV, and upon treatment with tetradecanoyl phorbol acetate (TPA) these cells undergo the complete program of KSHV gene expression, resulting ultimately in viral replication and the release of mature virions14. We asked whether KSHV miRNAs might be differentially expressed upon induction of the virus lytic cycle. We isolated total RNA from BCBL1 cells at various times after TPA treatment and examined KSHV miRNA expression by northern blot analysis. We found that KSHV-miR-K12-10 expression was increased up to 6.7-fold in TPA-induced cells relative to untreated cells, whereas the expression of the miRNAs confined to the intronic region was unaffected (Fig. 2c).

Features of the MHV68 miRNAs

To address evolutionary relationships in the gammaherpesvirus family, we studied MHV68, which is the closest relative to human EBV and KSHV11. All of the nine cloned miRNAs cluster within a 6-kb region (Fig. 3a). This region also contains eight previously noted tRNA genes, whose products, surprisingly, were not aminoacylated19. Because all of the cloned MHV68 miRNAs are located immediately downstream of these tRNA sequences (Fig. 3a), the pre-miRNAs are probably transcribed by pol III. The pol III tRNA promoter of a conventional tRNA gene consists of an A box (positions +8 to +19) and a B box (positions +52 to +62), followed by a run of oligo(T) as a terminator signal20. Secondary structure prediction of the putative tRNA-pre-miRNA primary transcripts (Fig. 3b) indicates that one or two 20–25-base-pair pre-miRNA hairpins follow the tRNA structure. In each case a run of oligo(T) is identified at the end of the first or the second pre-miRNA. Sequence analysis indicated that the eight tRNA promoters may express a total of 14 distinct pre-miRNAs (Supplementary Table 4 online), 9 of which we identified by cloning.

To confirm that MHV68 miRNAs are embedded in the pol III tRNA primary transcripts, we performed a northern blot analysis on the three most abundant MHV68 miRNAs using probes matching to either the mature miRNA or the first 20 nt of the upstream tRNA (Fig. 3c). The probes complementary to miR-M1-1, miR-M1-5 and miR-M1-7-5p detected bands of 20 nt, 60 nt and 200 nt (Fig. 3c, panels 1, 3 and 5). The 200-nt band was also detected with probes specific for the upstream tRNAs (Fig. 3c, panels 2, 4 and 6) in addition to the 75-nt band of the mature tRNA (Fig. 3c, panels 4 and 6).

Features of the HCMV miRNAs

HCMV is a member of the betaherpesvirus subfamily21. In contrast to the viruses described above that were studied during their latency, we cloned the HCMV small RNAs from RNA isolated from lytically infected primary cells. The proportion of cloned rRNA degradation products was notably higher than in the libraries we prepared from latently infected cells, which may be due to an increased RNA turnover in apoptotic cells (Supplementary Table 2 online). We cloned 424 small RNAs that mapped to the virus genome; 60% of these sequences were cloned multiple times and corresponded to nine miRNAs. In contrast to KSHV and MHV68, the HCMV miRNAs are spread across the viral genome (Fig. 4a). Three of the nine miRNAs are transcribed from the complementary strand of known ORFs, five miRNAs are located in intergenic regions and one is located within an intron (Fig. 4a). The remaining sequences were cloned mostly once, sometimes twice (Supplementary Table 6 online), and did not represent miRNAs but were rather random degradation products of the abundant viral transcripts.

Discussion

We developed a new prediction algorithm that provides a reliable starting point for guiding experimental identification of viral miRNAs. In contrast to other miRNA prediction software, our program does not rely on evolutionary conservation to identify miRNA genes. Consistent with low prediction scores, we were unable to identify miRNAs by cloning in several small-genome RNA viruses, but we confirmed many predicted miRNA genes by cloning the miRNAs from large-genome DNA viruses.

We determined the sensitivity (fraction of cloned miRNAs for which we predicted precursors) and specificity (fraction of predictions for which mature miRNAs were cloned) of our prediction method for different values of the prediction threshold for all the viruses for which we obtained cloning data (Supplementary Fig. 2 online). For three of the four viruses from which we cloned miRNAs, the specificity increased markedly when the score threshold reached a value of 0, still keeping the sensitivity above 40%. Thus, the threshold score that we set for prediction based on the training sets provides a good tradeoff between sensitivity and specificity for the virus data. Using this score threshold we predicted the location of 13 (39%) of the 33 precursors of cloned miRNAs, and we did not predict miRNA precursors in the large genomes of HHV3, HHV6 and HHV7. This recovery rate for viral miRNAs is lower than for the training set, in part because of some unusual features of viral miRNAs, such as the relatively short stems of the pol III–transcribed MHV68 miRNAs. But MHV68 was still predicted to contain miRNAs with a probability of 97.6%. In contrast, the probability of encoding miRNAs was considerably lower for the similar-sized genomes of HHV3, HHV6 or HHV7 because all of the stem-loops predicted from these viruses had relatively low scores. If we disregard the direction of transcription, we made 23 unique predictions, of which 14 (61%) were experimentally confirmed. It is conceivable that some predicted miRNAs were undetectable because they may be under the control of promoters that are only active at one of the various latent or lytic stages of infection or depend on the specific infected cell type. For example, EBV, KSHV and MHV68 can infect both B cells and epithelial cells11,22,23, which have distinct gene expression programs.
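The two measures defined above can be computed for any score threshold as in the following sketch. It assumes that every cloned miRNA precursor has an SVM score; the function name and the toy data are hypothetical. Note that specificity, as defined here, is the fraction of predictions that were confirmed by cloning.

```python
def sensitivity_specificity(stems, threshold):
    """Evaluate predictions against cloning data at one score threshold.

    stems: list of (score, cloned) pairs, where cloned is True when a mature
    miRNA from that stem was recovered by cloning.
    Returns (sensitivity, specificity) using the definitions in the text:
    sensitivity = confirmed predictions / cloned miRNAs,
    specificity = confirmed predictions / all predictions.
    """
    predicted = [cloned for score, cloned in stems if score > threshold]
    n_cloned = sum(1 for _, cloned in stems if cloned)
    confirmed = sum(predicted)
    sens = confirmed / n_cloned if n_cloned else float("nan")
    spec = confirmed / len(predicted) if predicted else float("nan")
    return sens, spec


# Hypothetical toy data, for illustration only.
toy_stems = [(1.5, True), (0.7, True), (0.2, False), (-0.3, True), (-0.8, False)]
toy_sens, toy_spec = sensitivity_specificity(toy_stems, threshold=0.0)

# The percentages quoted in the text follow from the reported counts.
recovered_percent = round(100 * 13 / 33)   # 13 of 33 cloned precursors
confirmed_percent = round(100 * 14 / 23)   # 14 of 23 unique predictions
```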

In many organisms, RNA silencing is part of an innate immune response against RNA viruses (reviewed in refs. 24 and 25) and transposable elements26,27,28. In response, counter-defense strategies to thwart the host response have been found in plant viruses and in the insect Flock house virus, and have been proposed for vaccinia and influenza viruses (reviewed in ref. 25). But we could not detect any viral siRNA in HCV-, YFV- or HIV1-infected cells, and showed that HCV does not inhibit RNAi of a cellular gene. Thus, it is likely that human RNA viruses effectively hide dsRNA replication intermediates and do not interfere with the RNA silencing machinery.

The human gammaherpesviruses EBV and KSHV pose an important health problem and are closely associated with various types of cancer22,23. The new discovery of miRNAs may shed some light on the mechanism of herpesvirus pathogenesis. All of the KSHV miRNAs were found in a locus previously reported to encode transforming properties16,29. The most interesting segment of this transforming region is the kaposin gene, which is expressed in three different isoforms, kaposin A, B and C, depending on the translation initiation codon used18 (Fig. 2a). Notably, KSHV-miR-K12-10 is located within the kaposin mRNA transcript, in a position overlapping the K12 ORF. Excision of the K12-10 pre-miRNA presumably abolishes protein expression from the remaining fragments, which are missing the cap or poly(A) tail. Similar to the EBV BHRF1 gene and its miRNAs, whose expression increases along with lytic replication, kaposin and its embedded KSHV-miR-K12-10 are upregulated by induction of the lytic stage. This indicates that viruses can express both noncoding and coding sequences simultaneously, and presumably, depending on the expression levels, a certain fraction of transcripts may escape miRNA processing to then serve as template for protein synthesis.

KSHV-miR-K12-10 is edited at position 2 of the mature sequence. It is unclear whether this editing has a regulatory function, considering that the A-to-I change is within the region of a miRNA shown to be the most critical for target recognition30. The sequence and structural context of the edited A is optimal for recognition by adenosine deaminases acting on RNA, which prefer a 5′ uracil and a mismatched cytosine on the opposite strand31. Editing of a cellular miRNA has been reported previously, but at a substantially lower frequency than observed for this viral miRNA32.

Analysis of many primary transcripts of miRNAs indicated that they generally originate as pol II transcripts3, which permits tissue-specific and developmental regulation of miRNA transcripts by the same mechanisms as govern the expression of protein-coding genes. It was therefore surprising to find that transcription of MHV68 miRNAs is initiated from a tRNA-specific pol III promoter, which is largely unregulated and active in any cell type. The discovery that tRNA promoters drive MHV68 miRNA expression explains the puzzling detection of uncharged viral tRNAs in germinal centers of MHV68 latently infected mice19, and suggests that these tRNA-like molecules are nonfunctional remnants from the expressed miRNA promoter region. Furthermore, the MHV68 pre-miRNAs have an unusual short-hairpin structure, suggesting that their processing and nuclear export mechanism may also be distinct from host cell miRNA maturation.

Almost all of the identified viral miRNA genes are located within regions of the genome that are specific for each virus analyzed, and the cloned miRNAs share no sequence similarity among themselves or with host cell miRNAs, suggesting a short and independent evolutionary history. Only HCMV miRNAs seem to be partially conserved with respect to the chimpanzee CMV, but not the more distant mouse CMV. The detection of unusual pol III–driven miRNAs further supports our hypothesis that viral miRNAs evolved independently of each other.

The cellular targets for most of these viral miRNAs remain to be identified. Those miRNAs that are generated from a strand opposite a protein-coding segment may act like siRNAs in regulating the level of viral protein. CMV-miR-UL112-1 might target UL114, a homolog of the mammalian uracil-DNA glycosylase. UL114 is required for efficient viral DNA replication33, suggesting that the miRNA expression may control viral replication. A similar function has previously been attributed to a miRNA expressed in EBV7. Because there is no substantial conservation between viral miRNAs or between viral miRNAs and the host cell miRNAs, the computational identification of regulated host or viral genes will be challenging.

Methods

miRNA predictions.

One essential feature of pri- and pre-miRNAs appears to be a stem-loop secondary structure that is recognized at various steps of biogenesis. Therefore, in the first step of our algorithm we identified genomic regions (50–100 nt long) that, when transcribed, are predicted to assume the same stem-loop secondary structures, independent of the precise start and end of the transcript. The detailed calculation of conserved stem-loops is described in the Supplementary Methods online.

We then used the SVM methodology to predict new miRNA precursors among a set of conserved stem-loops. The SVM requires that each data point (in our case a stem-loop) be described in terms of a vector of features, and we included the free energy of folding, the length, the length of the longest symmetrical stem, the count of A, C, G and U nucleotides in the symmetrical stem, and the number of A-U, G-C and G-U pairs in the predicted minimal energy structure, among other criteria. We trained the classifier using conserved stem-loops from human miRNA loci as positives and sequences that are not likely to be processed by the RNAi machinery (mRNAs, rRNAs, tRNAs, random regions of the human genome and viral genomes) as negatives. Although our data sets of positives and negatives are not linearly separable, using a quadratic kernel and penalizing misclassification of positives eight times more than the misclassification of negatives allowed us to obtain a model that assigns a positive score to 71% of the positives and to only 3% of the negatives.
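To make the notion of a feature vector concrete, the sketch below derives a few of the listed properties from a sequence and its predicted structure in dot-bracket notation. It is a simplified illustration under stated assumptions: the structure is taken as given (no folding is performed, so the free energy is not included), nucleotides are counted over the whole stem-loop rather than over the symmetrical stem only, and the function names and example hairpin are hypothetical.

```python
def base_pairs(structure):
    """Return (i, j) index pairs from a dot-bracket structure string."""
    stack, pairs = [], []
    for i, symbol in enumerate(structure):
        if symbol == "(":
            stack.append(i)
        elif symbol == ")":
            if not stack:
                raise ValueError("unbalanced structure")
            pairs.append((stack.pop(), i))
    if stack:
        raise ValueError("unbalanced structure")
    return pairs


def stem_loop_features(sequence, structure):
    """Length, nucleotide counts and base-pair type counts of a stem-loop."""
    seq = sequence.upper().replace("T", "U")
    if len(seq) != len(structure):
        raise ValueError("sequence and structure differ in length")
    features = {"length": len(seq)}
    for nt in "ACGU":
        features[nt] = seq.count(nt)
    features.update({"AU": 0, "GC": 0, "GU": 0})
    names = {"AU": "AU", "CG": "GC", "GU": "GU"}
    for i, j in base_pairs(structure):
        key = "".join(sorted(seq[i] + seq[j]))
        if key in names:          # noncanonical pairs are ignored
            features[names[key]] += 1
    return features


# Hypothetical 12-nt hairpin, for illustration only.
toy_features = stem_loop_features("GGAUGAAAGUCU", "((((....))))")
```

Such a dictionary can be flattened in a fixed key order to give the numeric vector that a classifier expects.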

Given the distributions of scores for the set of positive and negative examples, and the scores for the stable stems identified in a given genome, we estimated the minimal number of miRNA precursors expected in each genome as follows. We assume that the probability p that a conserved stem contains a miRNA is only a function of the score s assigned to the stem by the SVM. We expect that the function p(s) has a step-like monotonic behavior, being close to zero for negative values of s and becoming asymptotically close to 1 for large positive scores. To concretely specify p(s), we first choose a suitable parametric functional expression (for instance, $p(s) = 0.5 \times (1 + \tanh(as + b))$, in which a and b are the parameters) and then fix its parameters by maximizing the likelihood of the training set, defined as $\prod_{i \in \text{positives}} p(s_i) \times \prod_{j \in \text{negatives}} (1 - p(s_j))$. The expression for the probability of having exactly m miRNAs is $P(m) = \sum_{\text{all } m\text{-tuples of candidates}} p(s_1) \cdots p(s_m) \times \prod_{i \notin m\text{-tuple}} (1 - p(s_i))$. The fifth column in Table 1 contains the quantity $\sum_{m \geq 1} P(m)$, which corresponds to the probability of finding at least one miRNA precursor in the set of candidates.
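The sum over all m-tuples does not need to be enumerated explicitly. If the candidates are treated as independent, P(m) can be accumulated one candidate at a time, and the probability of at least one miRNA precursor reduces to one minus the product of (1 − p(s_i)) over all candidates. The sketch below illustrates this; the function names are hypothetical, and the parameters a and b passed to `p_of_score` are placeholders rather than fitted values.

```python
import math


def p_of_score(s, a, b):
    """Example parametric form p(s) = 0.5 * (1 + tanh(a*s + b))."""
    return 0.5 * (1.0 + math.tanh(a * s + b))


def count_distribution(probs):
    """Return [P(0), P(1), ..., P(n)] for n independent candidates.

    probs[i] is the probability that candidate i is a miRNA precursor.
    Each candidate either adds one precursor (probability p) or none.
    """
    dist = [1.0]
    for p in probs:
        updated = [0.0] * (len(dist) + 1)
        for m, q in enumerate(dist):
            updated[m] += q * (1.0 - p)
            updated[m + 1] += q * p
        dist = updated
    return dist


def prob_at_least_one(probs):
    """Sum of P(m) for m >= 1, computed as 1 - P(0)."""
    return 1.0 - math.prod(1.0 - p for p in probs)


# Hypothetical per-candidate probabilities, for illustration only.
toy_probs = [0.5, 0.5]
toy_dist = count_distribution(toy_probs)
toy_any = prob_at_least_one(toy_probs)
```

For two candidates with p = 0.5 each, the distribution is P(0) = 0.25, P(1) = 0.5 and P(2) = 0.25, so the probability of at least one precursor is 0.75.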

RNA preparation and small-RNA cloning.

Total RNA extraction and cloning were performed as described previously12 with the following modifications for the KSHV, MHV68 and HCMV libraries. We spiked 200 μg total RNA with 0.2 nM radiolabeled 19- and 24-nt oligoribonucleotides containing the PmeI restriction site (5′-CGUACGCGGGUUUAAACGA-3′ and 5′-CGUACGCGGAAUAGUUUAAACUGU-3′), size-fractionated the mixture and ligated the recovered small RNAs to a 5′-adenylated 3′ adapter oligonucleotide (5′-AppTTTAACCGCGAATTCCAG-L-3′; Ap, adenylate; p, phosphate; L, C6-amino linker (Dharmacon)) using the Rnl2(1–249) ligase13 in the absence of ATP. The reaction was performed in a final volume of 20 μl containing 0.1 μg of the Rnl2(1–249) ligase, 5 μM of the 3′ adapter oligonucleotide, 0.01 M MgCl2, 0.01 M 2-mercaptoethanol, 0.05 M Tris-HCl (pH 7.6) and 0.1 mg acetylated bovine serum albumin. After gel purification, the ligation product was directly joined to the 5′ adapter chimeric DNA/RNA oligonucleotide (5′-ACGGAATTCCTCACTrArArA-3′; r, ribonucleotide) using T4 RNA ligase 1 (NEB) in the presence of ATP and in the same reaction solution as above. The final ligation product was reverse transcribed using the primer 5′-GACTAGCTGGAATTCGCGGTTAAA-3′ and PCR amplified using the reverse transcription primer and the primer 5′-CAGCCAACGGAATTCCTCACTAAA-3′. After purification, the PCR product was digested with PmeI to eliminate size marker sequences and an additional PCR was performed to introduce the nonpalindromic BanI restriction sites with primers 5′-CAGCCAACAGGCACCGAATTCCTCACTAAA-3′ and 5′-GACTAGCTTGGTGCCGAATTCGCGGTTAAA-3′. The BanI-digested PCR products were then concatemerized using T4 DNA ligase and later ligated into the commercial T/A cloning vector pCR2.1 (Invitrogen). Individual colonies were screened for the presence of inserts by PCR and the insert-containing PCR products were sequenced as described elsewhere7.
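As a simple consistency check on the oligonucleotide design, the following sketch confirms that both size markers carry a PmeI site and that both second-round PCR primers carry a BanI site. The recognition sequences used here (GTTTAAAC for PmeI and GGYRCC for BanI) are the standard ones and are not stated in the text above; the function names are hypothetical.

```python
import re

PMEI_SITE = "GTTTAAAC"                  # PmeI recognition sequence
BANI_SITE = re.compile("GG[CT][AG]CC")  # BanI recognition sequence GGYRCC


def as_dna(seq):
    """Upper-case a sequence and convert RNA (U) to DNA (T)."""
    return seq.upper().replace("U", "T")


def has_pmei_site(seq):
    """True if the sequence, read as DNA, contains a PmeI site."""
    return PMEI_SITE in as_dna(seq)


def has_bani_site(seq):
    """True if the sequence, read as DNA, contains a BanI site."""
    return BANI_SITE.search(as_dna(seq)) is not None


# Sequences copied from the protocol above.
size_markers = ["CGUACGCGGGUUUAAACGA", "CGUACGCGGAAUAGUUUAAACUGU"]
bani_primers = ["CAGCCAACAGGCACCGAATTCCTCACTAAA",
                "GACTAGCTTGGTGCCGAATTCGCGGTTAAA"]

markers_cleavable = all(has_pmei_site(m) for m in size_markers)
primers_carry_bani = all(has_bani_site(p) for p in bani_primers)
```

The same functions can be applied to a cloned insert to flag sequences that would be lost at either digestion step.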

Note: Supplementary information is available on the Nature Methods website.