Sweet potato viromes in eight different geographical regions in Korea and two different cultivars

The sweet potato in the family Convolvulaceae is a dicotyledonous perennial plant. Here, we conducted a comprehensive sweet potato virome study using 10 different libraries from eight regions in Korea and two different sweet potato cultivars by RNA-Sequencing. Comprehensive bioinformatics analyses revealed 10 different virus species infecting sweet potato. Moreover, we identified two novel viruses infecting sweet potato referred to as Sweet potato virus E (SPVE) in the genus Potyvirus and Sweet potato virus F (SPVF) in the genus Carlavirus. Of the identified viruses, Sweet potato feathery mottle virus (SPFMV) was the dominant virus followed by Sweet potato virus C (SPVC) and SPVE in Korea. We obtained a total of 30 viral genomes for eight viruses. Our phylogenetic analyses showed many potyvirus isolates are highly correlated with geographical regions. However, two isolates of SPFMV and a single isolate of Sweet potato virus G (SPVG) were genetically distant from other known isolates. The mutation rate was the highest in SPFMV followed by SPVC and SPVG. Two different sweet potato cultivars, Beni Haruka and Hogammi, were infected by seven and five viruses, respectively. Taken together, we provide a complete list of viruses infecting sweet potato in Korea and diagnostic methods.


Results
Collection of leaf samples for sweet potato virome study. In order to identify viruses infecting sweet potato in Korea, we collected leaf samples from eight different geographical regions in 2016 and 2017 (Table 1 and Fig. 1a). All sweet potato plants were grown in fields, as shown in Fig. 1b. We collected leaf samples showing disease symptoms (Fig. 1c,d). Samples were pooled based on the collected regions. In addition, we collected two representative cultivars referred to as "Beni Haruka" and "Hogammi," which were widely cultivated in Yeoju, Korea. A total of 357 samples were subjected to RNA-Seq. Ten different libraries for RNA-Seq were prepared and paired-end sequenced by the HiSeq2000 system. For simplicity, we named the libraries based on geographical regions and cultivar names (Table 2). For instance, a library from "Hogammi" cultivated in Yeoju was referred to as YJ-H.

Transcriptome assembly and virus identification. Raw sequence reads from 10 libraries were individually de novo assembled using the Trinity program. The number of assembled contigs (transcripts) ranged from 65,866 (HN) to 187,838 (NS) (Table 3). The contigs obtained from each library were used for a BLASTN search against the plant virus reference genome sequences derived from the viral genome database (https://www.ncbi.nlm.nih.gov/genome/viruses/). Based on the BLASTN search using assembled contigs, we obtained a total of 646 virus-associated contigs representing 10 different sweet potato viruses from 10 sweet potato transcriptomes (Table 4). Based on the number of virus-associated contigs, SPFMV (249 contigs) was the dominant virus infecting sweet potato followed by SPVC (129 contigs) and SPLCV (78 contigs) (Fig. 2a), while based on virus-associated reads, SPVC (202,975 reads) was the major virus infecting sweet potato followed by SPFMV (198,658 reads) and SPVE (Sweet potato virus E) (178,532 reads) (Fig. 2b).
We examined the proportion of identified viruses in each library based on virus-associated contigs (Fig. 2c) and reads (Fig. 2d). According to virus-associated contigs, SPFMV was the major virus in seven libraries including HN, IC, IS, NS, YJ, YJ-B, and YJ-H (Fig. 2c). Based on virus-associated reads, SPVC was the major virus in four libraries including GJ, GS, HN, and IS, while SPFMV was the dominant virus in NS, YJ-B, and YJ-H (Fig. 2d). SPVE and SPVG were the dominant viruses in IC and YJ, respectively. From the YA library, only SPSMV was identified.

Table 1. Detailed information of sweet potato samples for sweet potato viromes in Korea. Geographical regions were abbreviated and used for library names. Sweet potato leaf samples were pooled according to geographical regions. The sweet potato cultivar names of mixed samples were unknown. Two known sweet potato cultivars, "Beni Haruka" and "Hogammi," grown in Yeoju were used. Samples were collected in May 2016 and May 2017.

(Table S1 and Fig. 3d). We identified two novel viruses tentatively named SPVE and Sweet potato virus F (SPVF) from the sweet potato transcriptomes (Fig. 3). The SPVF isolate GS with a single-stranded RNA genome of 9,122 nucleotides (nt) was assembled from the GS library. SPVF isolate GS contained six open reading frames (ORFs) encoding RNA-dependent RNA polymerase (RdRp), three triple gene block (TGB) proteins, one coat protein (CP), and one nucleic acid-binding protein (NABP) (Fig. 3e). A BLASTN search against the nucleotide (NT) database of the National Center for Biotechnology Information (NCBI) revealed that SPVF isolate GS shared 67% coverage and 76% nucleotide identity with the SPCFV isolate UN210 (KP115607.1) identified from Korea (Fig. 3f). The phylogenetic trees based on RdRp (Fig. 3g) and CP (Fig. 3h) amino acid sequences showed that SPVF belongs to the same clade as SPCFV, which is a member of the genus Carlavirus in the family Betaflexiviridae.
We identified SPVF from the GS and IC libraries; however, the genome of SPVF isolate IC was incomplete. The phylogenetic tree using genome sequences for SPVF isolates GS and IC and three SPCFV isolates showed that the two SPVF isolates were grouped together. Species in the genus Carlavirus are demarcated at less than 72% nt identity or less than 80% aa identity between their respective CP or polymerase genes. RNA-dependent RNA polymerase (RdRp) and coat protein (CP) for SPVF showed 78.41% and 81.89% amino acid identity, respectively; the RdRp value falls below the 80% threshold, indicating that SPVF is a new species in the genus Carlavirus (Table S1 and Fig. 3i).
www.nature.com/scientificreports

Complete genomes of viruses infecting sweet potato and phylogenetic analyses. From sweet potato transcriptomes, we assembled genomes of several viruses infecting sweet potato. We assembled 12 nearly complete genomes covering ORFs for seven viruses including SPFMV, SPLCV, SPLV, SPVC, SPVE, SPVF, and SPVG (Table 5). Except for SPLCV and SPVF, all viruses belonged to the genus Potyvirus. In addition, we obtained several partial sequences for the identified sweet potato viruses.

(i) Maximum likelihood phylogenetic tree of genome sequences for SPCFV, SPYMV, and SPVF. EBCVA was used as an outgroup. Two genome sequences for SPVF isolates GS and IC were included in the phylogenetic construction. For the phylogenetic tree construction, all available protein or genome sequences homologous to SPVE or SPVF were retrieved from GenBank based on BLASTP and BLASTN searches, respectively. Accession number, isolate name, and virus name were described. Orange color indicates SPVE or SPVF. We used bootstrap replication values of 1,000, and bootstrap values over 70% are shown.
(2020) 10:2588 | https://doi.org/10.1038/s41598-020-59518-x

In order to reveal the phylogenetic relationships of the identified viruses, we generated phylogenetic trees based on genome sequences (Table S2 and Fig. 4). The phylogenetic tree of SPFMV showed two well-defined groups, Group A and Group B (Fig. 4a). Group A contained three SPFMV isolates, IS, YJ, and IC, while Group B included a single SPFMV isolate GS. Interestingly, two SPFMV isolates, YJ-H and YJ-B, were genetically distant from other known SPFMV isolates. According to the phylogenetic tree, SPFMV isolate YJ-B is a common ancestor of known SPFMV isolates. Genome sequences of seven SPVC isolates in this study were used for phylogenetic tree construction (Fig. 4b). The phylogenetic tree of SPVC showed three defined groups. All SPVC isolates in this study belonged to Group A containing isolates from Korea, China, and Australia. The phylogenetic tree of SPVG revealed two groups of SPVG isolates (Fig. 4c). According to the phylogenetic tree, two SPVG isolates, IS and WT325, in Group B were distantly related to those in Group A. The phylogenetic tree using genome sequences of SPLV isolates revealed that three SPLV isolates were very closely related (Fig. 4d). In the case of SPV2, all four SPV2 isolates in this study were closely related (Fig. 4e).
The phylogenetic tree of SPLCV displayed three defined groups (Fig. 4f). Group A included two SPLCV isolates, YJ and YJ-B, whereas Group C contained only SPLCV isolate GS. In particular, SPLCV isolate IC was genetically distant from other SPLCV isolates.

Mapping of raw sequence reads and mutation rates of identified viruses. Viruses evolve faster than other organisms and exhibit strong genetic variation within the infected host. To determine virus mutations, we mapped raw sequence reads on the 12 virus genomes in which nearly complete genomes were assembled from this study (Figs. 5 and 6). Using SAMtools, we conducted variant calling for individual virus genomes resulting in the identification of single nucleotide polymorphisms (SNPs).
Firstly, we examined the mapping patterns of sequence reads on the individual virus. Many reads were mapped on the whole region of the virus without gaps, enabling us to obtain the nearly complete genome (Figs. 5 and 6). Most viruses such as SPVF, SPVG, SPLV, SPFMV, and SPVC showed that the number of mapped reads increased from the 5′ region to the 3′ region of the virus genome (Figs. 5 and 6). In particular, SPLCV DNA A having a circular DNA genome exhibited different mapping patterns from other viruses. For example, a relatively small number of reads were mapped on the 5′ and 3′ regions and the region between AV1 and AC3 (Fig. 5d). The number of mapped reads was reduced from the AC3 region to the AC4 region.
We next examined the number of SNPs in each virus genome (Fig. 6g). Three SPFMV isolates showed a large number of SNPs ranging from 512 (Isolate YJ) to 1,036 (Isolate YJ-B) followed by three SPVC isolates ranging from 345 (Isolate IS) to 399 (Isolate GS). SPLV isolate YJ-B had the smallest number of SNPs among the 12 virus genomes. We calculated the frequency of SNPs according to each virus genome (Fig. 6h). Again, three SPFMV isolates showed a high frequency of SNPs ranging from 4.71% (Isolate YJ) to 9.67% (Isolate YJ-B). SPVE, SPVF, and SPVG showed low frequencies of SNPs.
We examined the positions of SNPs in each virus genome (Figs. 5 and 6). In general, the identified SNPs were randomly distributed on the virus genome. In particular, two SPVG isolates had many SNPs on the 3′ end region of the virus.
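As a rough illustration of the SNP frequency discussed above, the sketch below assumes that the frequency is the number of SNP positions divided by the genome length, expressed as a percentage. Both that definition and the genome length used in the example are assumptions for illustration only; the text does not state them explicitly.

```python
def snp_frequency(num_snps: int, genome_length: int) -> float:
    """Return SNPs per genome position as a percentage (assumed definition)."""
    if genome_length <= 0:
        raise ValueError("genome_length must be positive")
    return 100.0 * num_snps / genome_length


# 1,036 is the SNP count reported for SPFMV isolate YJ-B.
# The genome length below is hypothetical and chosen only for the example.
HYPOTHETICAL_GENOME_LENGTH = 10_700
freq = snp_frequency(1036, HYPOTHETICAL_GENOME_LENGTH)
print(f"{freq:.2f}%")
```

Under this definition, a frequency near 10% corresponds to roughly one variable site per ten nucleotides.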

Development of molecular methods to diagnose viruses infecting sweet potato. In this study, we identified a total of 10 different viruses infecting sweet potato. Based on the virus-associated reads in each library, SPCFV (16 reads) was identified from only one library (IS), while SPFMV (198,658 reads) was identified from eight libraries (Table 6). It is important to confirm RNA-Seq results and to develop molecular diagnostic methods for viruses infecting sweet potato. For that, we designed primer pairs for 10 viruses. RT-PCR primers were designed for nine RNA viruses with a single RNA genome, while PCR primers were designed for SPLCV with a single circular DNA genome based on the obtained virus sequences (Table S3 and Fig. 7a). A primer pair amplifying a partial actin gene was used as a positive control. We extracted total RNA and DNA from the same samples used for RNA-Seq. Amplified PCR products from each sample were visualized by gel electrophoresis (Figs. 7b and S1). In general, PCR results were consistent with those of RNA-Seq (Fig. 7b). For example, SPFMV was identified from nine libraries (all except YA) by RT-PCR, whereas it was identified from eight libraries (all except GS and YA) by RNA-Seq (Fig. 7b). The PCR and RNA-Seq results were identical for SPV2, SPLCV, SPVF, SPSMV, and SPLV. In the case of SPVC, SPVE, and SPCFV, RT-PCR identified an additional virus infection compared to RNA-Seq. Based on genome sequence and phylogenetic analyses, SPVG isolate IS was distantly related to the other SPVG isolates. Therefore, we designed an additional primer pair for SPVG isolate IS. This primer pair, which yields a 702 bp product, amplified SPVG isolate IS with high specificity.
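The comparison between RT-PCR and RNA-Seq detection can be expressed as simple set operations. The sketch below uses the SPFMV result described above (RT-PCR positive in all libraries except YA; RNA-Seq positive in all except GS and YA). The function name `compare_detection` is hypothetical and only illustrates the bookkeeping.

```python
# Library names as used in the text (eight regions plus two cultivar libraries).
ALL_LIBRARIES = {"GJ", "GS", "HN", "IC", "IS", "NS", "YA", "YJ", "YJ-B", "YJ-H"}

# SPFMV detections as described in the text.
rt_pcr_positive = ALL_LIBRARIES - {"YA"}
rna_seq_positive = ALL_LIBRARIES - {"GS", "YA"}


def compare_detection(pcr: set, seq: set) -> dict:
    """Split libraries into those detected by both methods or by only one."""
    return {
        "both": sorted(pcr & seq),
        "rt_pcr_only": sorted(pcr - seq),
        "rna_seq_only": sorted(seq - pcr),
    }


spfmv = compare_detection(rt_pcr_positive, rna_seq_positive)
print(spfmv["rt_pcr_only"])  # ['GS']
```

The same comparison could be repeated per virus to list the libraries in which RT-PCR detected an infection that RNA-Seq missed.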

Discussion
The word "virome" is derived from two words, "virus" and "genome," which suggests it is similar to a viral genome 28 . Strictly speaking, a virome should be defined as all virus-associated nucleic acids existing in a specific organism, tissue, or environment 29 . A virome study should include viral genome information for not only a single virus but also multiple viruses. Moreover, a virome study reveals not only the presence of viruses but also the complexity of viral genomes, viral replication, viral mutation, and the change of viral genomes in a certain condition, as shown previously 24,26,27 .
Research associated with viruses infecting sweet potato has been intensively conducted in many countries 14,17,23,30 . However, there have been few comprehensive studies associated with sweet potato viromes. For instance, a previous study examined sweet potato viromes in three different sweet potato cultivars in China by NGS, revealing a total of 15 different viruses infecting sweet potato 21 . In addition, a large-scale sweet potato virome study is currently being conducted by the small RNA-Seq of 1,750 sweet potato samples collected from 12 countries in Africa, where the sweet potato is one of the main crops (http://bioinfo.bti.cornell.edu/virome/index).
In Korea, several studies have been carried out to understand viruses infecting sweet potato. For example, a nationwide survey was carried out to examine viruses infecting sweet potato from 2011 to 2014 14 . Based on multiplex RT-PCR assays, there were at least eight different viruses infecting sweet potato in Korea 16 . Furthermore, viral genomes of 18 isolates representing five potyviruses in Korea were determined 15 . As compared to previous studies associated with viruses infecting sweet potato in Korea, our sweet potato virome study provides valuable additional information associated with viruses infecting sweet potato in Korea. We revealed a total of 10 different sweet potato viromes representing eight different geographical regions and two major sweet potato cultivars in Korea. Our study demonstrated that each geographical region and cultivar displayed a unique sweet potato virome composed of different viruses, as shown in other plant viromes 27,31 . For instance, three viruses were identified from the GJ, HN, and NS regions; however, SPVC was dominantly present in GJ and HN, while SPFMV was the dominant virus in NS. The high proportion of SPLCV, which is transmitted by the whitefly, in the NS region compared to other regions suggests that insect vectors could be environmental factors determining the individual sweet potato virome. Samples for YJ-B and YJ-H were collected from the same region, Yeoju; however, the lists and proportions of infecting viruses were very different, showing cultivar-specific sweet potato viromes.
Here, we identified a total of 10 different virus species infecting sweet potato. Our RNA-Seq-based approaches identified novel viruses and new virus variants more successfully than a PCR-based viral genome study, which can amplify only highly conserved viral sequences 15 . In particular, the low similarity of the 5′ regions in SPVE and SPVF with other homologous viruses highlights the superiority of NGS techniques followed by bioinformatics analyses. Although SPVF in the genus Carlavirus was closely related to SPCFV, the BLAST results and phylogenetic analyses revealed that SPVF is a new species. Two SPVF isolates, GS and IC, in the same group were different from other SPCFV isolates. Moreover, our phylogenetic analyses demonstrated that Sweet potato yellow mottle virus (SPYMV) isolate Yeongdeok (KR072674.1) should be a member of SPCFV. In addition, two SPVE isolates, GS and YJ-B, were closely related to SPVC. However, the phylogenetic tree using polyprotein sequences of potyviruses infecting sweet potato demonstrated that SPVE was a new species in the genus Potyvirus.
Our sweet potato virome study indicated that SPFMV was the dominant virus followed by SPVC and SPVE in the examined regions in Korea. In addition, these three viruses are regarded as the main viruses infecting sweet potato in the world. We identified all previously reported viruses infecting sweet potato in Korea except Sweet potato golden vein-associated virus (SPGVaV) in the genus Begomovirus, which is rare in Korea 16,32 . Moreover, we did not identify SPCSV in the genus Crinivirus, which has been reported in many countries including China and Uganda 33,34 . Therefore, it is important to prevent the introduction of SPCSV to Korea.
Plant tissues and developmental stages are important factors for plant virome studies. Our previous studies demonstrated that viral RNA was enriched in grape fruits and lily flowers 24,35 . Of course, the optimal tissues for plant virome studies depend on the plant species. In our study, the proportion of virus-associated reads in the whole transcriptome was very low, indicating that leaf tissues might not be appropriate samples for the detection of viruses infecting sweet potato. As shown in a previous study 36 , we cautiously suggest using samples from the fibrous and tuberous roots of sweet potato for virus detection. Based on our experience studying viruses infecting plants with a large genome size such as the hexaploid sweet potato (4.4 Gb) 36 , selection of the proper tissue enriched with viruses is necessary for a successful virome study.
In this study, we used messenger RNA from total RNA for the library preparation using oligo-d(T). Eight RNA viruses in this study possessed polyadenylate (poly(A)) tails (all except SPLCV and SPSMV). As shown in other previous studies 21,35 , virus genomes with poly(A) tails can be easily assembled. It is also not surprising that genomes of DNA viruses such as SPLCV can be assembled from transcriptome data, as shown in our previous studies 26 . In many cases, the number of mapped sequence reads on the viral genomes with poly(A) tails increased from the 5′ region to the 3′ region. This pattern is likely related to the oligo-d(T) selection, which preferentially captures the regions of transcripts close to the poly(A) tails.
We examined the mutation rates for the 12 assembled virus isolates. Of them, SPFMV showed a high mutation rate of up to 9.67%. Furthermore, a recent study has found possible recombination within the NIa-Pro, CP, and P1 genes using available SPFMV genomes 30 . In addition, SPLCV also exhibited a high frequency of mutations of up to 4.52%. Similarly, several Korean SPLCV recombinants have been identified 13 . Thus, our result suggests that mutation and recombination contribute to the genetic diversity of SPFMV and SPLCV isolates. Although several potyviruses coinfected the plants, the mutation rate was the highest in SPFMV followed by SPVC, SPVG, and SPLV. SPLV showed the lowest mutation rate among the examined potyviruses. This result suggests SPFMV might play an important role in viral disease symptoms in coinfected sweet potato plants.
Here, we examined sweet potato viromes for two popular sweet potato cultivars in Korea, Beni Haruka and Hogammi. Beni Haruka originates from Japan, while Hogammi was recently developed by the Rural Development Administration (RDA) in Korea. As they are popular for roasting, many growers cultivate both cultivars. In the case of Beni Haruka, at least seven different viruses were identified, whereas five different viruses were identified from Hogammi. Interestingly, both cultivars were cultivated in the same field in Yeoju, Korea; however, Beni Haruka was more severely infected by viruses than Hogammi. The difference in the number of infected viruses between the two cultivars might be correlated with the cultivation period. That is, Beni Haruka has been cultivated for a long time, while Hogammi was introduced to Korean growers only three years ago by the RDA.

Table S3. (b) Agarose gel electrophoresis results by RT-PCR with newly designed primer pairs. Full-length gels of RT-PCR results can be found in Fig. S1 in the Supplementary Information. Actin gene of sweet potato was used as positive control. We used the same total RNA for both NGS and RT-PCR. Green color indicates RT-PCR primer pairs for two novel viruses, SPVE and SPVF, as well as a variant of SPVG.
It is now possible to de novo assemble many viral genomes from transcriptome data 37 . Similarly, we obtained 12 complete virus genomes and 18 nearly complete genomes for eight viruses. A total of 30 assembled viral genomes were further used for phylogenetic analyses. Our phylogenetic analyses showed that many isolates of SPVC, SPLV, SPV2, and SPLCV in this study were grouped together with other known isolates from Korea. In general, it is likely that virus genomes are highly correlated with geographical regions. However, two isolates of SPFMV and a single isolate of SPVG were genetically distant from other known isolates. For example, two SPFMV isolates, YJ-H and YJ-B, were revealed as common ancestors of other known SPFMV isolates. Furthermore, SPVG isolate IS in this study and SPVG isolate WT325 from Taiwan were genetically distant from other SPVG isolates, suggesting they are new variants of SPVG.
As shown in other previous studies, the coinfection of multiple viruses in sweet potato resulted in the dramatic reduction of sweet potato production by up to 50% 38 . Unfortunately, most sweet potato cultivars in Korea were severely coinfected by many viruses 14 . There are several possible reasons explaining how viruses infect sweet potato plants. The first is vegetative propagation. Most growers purchase sweet potato sprouts from seedling markets, and they are often already infected by diverse viruses. Surprisingly, a recent study showed that most sweet potato germplasm (83.8%) in the Bioenergy Research Center of the RDA in Korea, which is used to develop new cultivars or provide sweet potato cuttings, was already infected by multiple viruses 14 . Furthermore, a recent study demonstrated that SPLCV can be transmitted by seeds, suggesting the newly developed sweet potato cultivars might also be highly infected by viruses 8 . Thus, it seems that the sweet potato virus control in Korea should be conducted from the early stages of breeding.
Based on the above evidence, it is necessary to develop virus-free sweet potato cultivars in order to prevent the damages caused by viral diseases, as suggested previously 38 . For that, knowledge of the viruses infecting sweet potato and diagnostic methods are needed. In this study, we provide a complete list of viruses infecting sweet potato in Korea and diagnostic methods, which could be valuable information for the development of a virus-free sweet potato in Korea in the near future. Moreover, we suggest that the development of a virus-free sweet potato in Korea should be carried out by not only national institutes such as the RDA but also universities and companies. We have to provide various reasonable choices to growers to promote competition among different developers of virus-free sweet potatoes. Dependence on a single national institute does not guarantee the production of a high-quality virus-free sweet potato, as we have seen.

Methods
Collection of sweet potato samples. We collected sweet potato leaf samples from eight different geographical regions in Korea in May 2016 and May 2017 (Table 1 and Fig. 1a). The eight regions are the main sweet potato producing areas in Korea. Most collected samples showed viral disease symptoms; however, we also collected leaf samples without any visible disease symptoms. In order to examine viruses infecting sweet potato in different geographical regions, leaf samples were pooled according to geographical regions and used for total RNA extraction. In addition, leaf samples from two major sweet potato cultivars referred to as "Beni Haruka" and "Hogammi" were also collected to compare sweet potato viromes in different sweet potato cultivars. As a result, a total of 357 samples were used for 10 different libraries.
Total RNA extraction and library preparation for RNA-Seq. Leaf samples were pooled and then frozen in the presence of liquid nitrogen. The frozen leaf samples were ground with a pestle and mortar. We used the RNeasy Plant Mini Kit (Qiagen, Hilden, Germany) to extract total RNA for RNA-Seq based on the manufacturer's instructions. The quality and quantity of extracted total RNA were measured by gel electrophoresis followed by using an Agilent 2100 Bioanalyzer (Agilent, Santa Clara, CA, U.S.A.). Using the extracted total RNA, we generated 10 different RNA-Seq libraries using the NEBNext Ultra RNA Library Prep Kit for Illumina in accordance with the manufacturer's instructions (NEB, Ipswich, Massachusetts, U.S.A.). Briefly, poly(A)-tailed mRNA was extracted by using poly-T oligo-attached magnetic beads. We synthesized the first strand of cDNA from the extracted mRNA followed by a second strand of cDNA. The 3′ ends of the DNA fragments were adenylated. After adapter ligation, we performed PCR amplification to selectively enrich DNA fragments with adapters and amplify the large amount of DNA in the library. We again measured the quality and quantity of each library using the 2100 Bioanalyzer. The six prepared libraries from 2016 were paired-end sequenced by Theragen (Suwon, South Korea) using the HiSeq2300 platform, and the four libraries from 2017 were paired-end sequenced by Macrogen Co. (Seoul, South Korea) using the HiSeq2000 platform.
De novo transcriptome assembly and virus identification. In the bioinformatics analyses including the de novo transcriptome assembly and BLAST search, a workstation with two 20-core CPUs and 256 GB of RAM installed with Ubuntu 16.04.4 LTS was used. Based on the results of our previous studies 31 , we only used the Trinity program (version 2.0.2, released January 22, 2015) with default parameters for de novo transcriptome assembly 39 . Raw sequence reads from each library were de novo assembled using Trinity. We generated our own plant viral genome database by selecting only viruses infecting plant species from the viral reference database of the NCBI (https://www.ncbi.nlm.nih.gov/genome/viruses/). To identify virus-associated contigs, the assembled contigs (transcriptome) from each library were blasted against the plant viral genome database using MEGABLAST 40 with a cutoff E-value of 1e-6. The obtained virus-associated contigs were again blasted against NCBI's NT database to remove sweet potato host sequences and other contaminated sequences. Ultimately, only virus-associated contigs were used for the study of sweet potato viromes. To calculate the number of virus-associated reads for identified viruses, the BBMap program was used (https://sourceforge.net/projects/bbmap/).
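The filtering step described above can be sketched in a few lines. This is a minimal illustration, not the authors' script: it assumes the BLAST results were saved in the standard 12-column tabular format (query, subject, percent identity, alignment length, mismatches, gap opens, query start/end, subject start/end, E-value, bit score), which the text does not specify. The contig and subject identifiers in the example rows are made up.

```python
from collections import Counter

E_VALUE_CUTOFF = 1e-6  # cutoff stated in the text


def best_hits(tabular_lines, cutoff=E_VALUE_CUTOFF):
    """Keep, per contig, the highest-scoring hit that passes the E-value cutoff."""
    best = {}
    for line in tabular_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        contig, subject = fields[0], fields[1]
        evalue, bitscore = float(fields[10]), float(fields[11])
        if evalue > cutoff:
            continue
        if contig not in best or bitscore > best[contig][1]:
            best[contig] = (subject, bitscore)
    return best


def contigs_per_subject(best):
    """Count how many contigs were assigned to each subject sequence."""
    return Counter(subject for subject, _ in best.values())


# Made-up example rows (all identifiers are hypothetical).
example = [
    "contig_1\tvirus_A\t98.5\t500\t7\t0\t1\t500\t100\t599\t1e-120\t900",
    "contig_1\tvirus_B\t85.0\t300\t45\t0\t1\t300\t10\t309\t1e-40\t350",
    "contig_2\tvirus_B\t80.0\t60\t12\t0\t1\t60\t5\t64\t1e-3\t50",
    "contig_3\tvirus_A\t99.0\t800\t8\t0\t1\t800\t1\t800\t0.0\t1500",
]
counts = contigs_per_subject(best_hits(example))
print(counts)
```

In this example, `contig_2` is discarded because its only hit has an E-value above the cutoff, and `contig_1` is assigned to the subject with the higher bit score.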

Virus genome assembly and annotation.
Based on the BLAST results, the virus-associated contigs in each library were aligned on the identified virus reference genomes using the ClustalW program implemented in the MEGA7 program 41 . Several nearly complete viral genome sequences were obtained without any further alignment. Poly(A) tail sequences at the 3′ terminal were deleted. We again aligned raw sequence reads on the identified virus genome sequences to fill the missing gaps of the virus genome using a Burrows-Wheeler Aligner (BWA) program with default parameters 42 . To predict the ORFs in each virus genome, we used the ORF Finder program (https://www.ncbi.nlm.nih.gov/orffinder/). In addition, we manually checked the identified ORFs and the 5′ and 3′ untranslated regions (UTRs) by comparing the corresponding reference virus genome. Twelve complete viral genome sequences covering whole ORFs were deposited in NCBI's GenBank database with respective accession numbers (Table 5).
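Two of the steps above, removing the 3′ poly(A) tail and predicting ORFs, can be illustrated with a small sketch. This is not the algorithm used by ORF Finder; it is a simplified forward-strand scan for ATG-to-stop reading frames, and the minimum tail length and minimum ORF length are assumptions chosen for the example.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def trim_poly_a(seq: str, min_run: int = 10) -> str:
    """Remove a trailing run of A's if it is at least min_run long.

    Note: genuine genomic A's directly upstream of the tail are removed too.
    """
    seq = seq.upper()
    stripped = seq.rstrip("A")
    if len(seq) - len(stripped) >= min_run:
        return stripped
    return seq


def forward_orfs(seq: str, min_codons: int = 30):
    """Return (start, end, frame) for ATG..stop ORFs on the forward strand.

    start is 0-based; end is exclusive and includes the stop codon.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None
    return orfs


# Toy sequence: 2 nt leader, ATG, 40 GCT codons, stop, 2 nt, then a poly(A) tail.
toy = "CC" + "ATG" + "GCT" * 40 + "TAA" + "GG" + "A" * 25
trimmed = trim_poly_a(toy)
print(forward_orfs(trimmed))  # [(2, 128, 2)]
```

For a real genome, predicted ORFs would still need manual comparison with the reference annotation, as described above.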

Phylogenetic analyses of identified viruses.
For phylogenetic tree analyses, we used the virus genome sequences assembled in this study (Table S2) as well as the available viral genome sequences from GenBank for eight viruses. We used RdRp amino acid sequences, CP amino acid sequences, and a complete genome sequence for SPVF and its homologous sequences, whereas we used a polyprotein amino acid sequence and a complete genome sequence for SPVE and its homologous sequences. In the case of six viruses (i.e., SPFMV, SPVC, SPVG, SPLV, SPV2, and SPLCV), complete genome sequences obtained in this study as well as available complete viral genome sequences from GenBank were used. Six SPFMV genomes, seven SPVC genomes, two SPVG genomes, three SPLV genomes, four SPV2 genomes, and four SPLCV genomes were used for the construction of phylogenetic trees in this study. For the individual virus, viral sequences were aligned using the ClustalW program. The aligned nucleotide sequences or amino acid sequences were used for phylogenetic tree construction using the MEGA7 program with the maximum likelihood method and 1,000 bootstrap replicates 41 .

Identification of SNPs for 12 assembled virus genomes. For virus SNP identification, it is important to use the assembled virus genome sequences in each library as reference virus genome sequences to increase SNP specificity. We analyzed SNPs for the 12 assembled virus genomes as described previously 27 . The raw sequence reads in individual libraries were aligned on the assembled viral genome using the BWA program with default parameters, resulting in the Sequence Alignment Map (SAM) files. The SAM files were converted into Binary Alignment Map (BAM) files using SAMtools 43 . After that, the BAM files were sorted and used to generate Variant Call Format (VCF) files using the mpileup function of SAMtools for SNP calling. Finally, we called SNPs using BCFtools implemented in SAMtools. The positions of identified SNPs and mapped reads on each viral genome were visualized by the Tablet program 44 .
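Once variants have been called, counting SNPs per genome is a matter of reading the VCF records. The sketch below is an illustration under assumptions, not the authors' code: it treats a record as a SNP when the reference allele and at least one alternative allele are single nucleotides, and it skips records flagged `INDEL` in the INFO column. The example records are made up.

```python
def count_snps(vcf_lines):
    """Count single-nucleotide substitutions in VCF records, skipping indels."""
    snps = 0
    for line in vcf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header lines
        fields = line.rstrip("\n").split("\t")
        # VCF columns: CHROM, POS, ID, REF, ALT, QUAL, FILTER, INFO, ...
        ref, alts, info = fields[3], fields[4].split(","), fields[7]
        if "INDEL" in info.split(";"):
            continue
        if len(ref) == 1 and any(len(a) == 1 and a in "ACGT" for a in alts):
            snps += 1
    return snps


# Made-up records for illustration (names and values are hypothetical).
example_vcf = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "virus_genome\t120\t.\tA\tG\t60\t.\tDP=100",
    "virus_genome\t340\t.\tCT\tC\t50\t.\tINDEL;DP=80",
    "virus_genome\t900\t.\tT\tC,A\t70\t.\tDP=150",
    "virus_genome\t950\t.\tG\t.\t70\t.\tDP=150",
]
print(count_snps(example_vcf))  # 2
```

Here the deletion at position 340 and the non-variant site at position 950 are ignored, while the multi-allelic site at position 900 is counted once.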
RT-PCR assay. In order to confirm the results of RNA-Seq, we carried out RT-PCR. For that, we designed 12 RT-PCR primer pairs. An actin gene of sweet potato was used as a positive control. In the case of SPVG, two different primer pairs, SPVG and SPVG_IS, were designed. The SPVG_IS-specific primer pair can amplify the variant of SPVG identified from the IS region. Detailed information of designed primer pairs can be found in Table S3. The regions of each virus amplified by RT-PCR can be found in Fig. 7a. We used the same total RNA from the pooled samples as a template RNA for the RT-PCR assay. RT-PCR was performed using the DiaStar OneStep RT-PCR Kit (SolGent, Daejeon, Korea). As described previously 31 , the RT-PCR conditions were 50 °C for 30 min, 95 °C for 15 min, followed by 30 cycles at 95 °C for 20 sec, 50 °C to 56 °C for 40 sec (the annealing temperature can be varied depending on the Tm values of primers), and 72 °C for 1 min, with a final extension at 72 °C for 5 min. We checked the amplified RT-PCR products by gel electrophoresis followed by EtBr staining. Furthermore, we cloned the amplified RT-PCR product in the pGEM-T-Easy Vector (Promega, Wisconsin, US) followed by Sanger sequencing to confirm the sequences of amplified PCR products.
To confirm complete genome sequences of SPVE and SPVF, we carried out RT-PCR using newly designed primers (Table S4 and Fig. S2). The amplified RT-PCR products were visualized by gel electrophoresis (Fig. S2). We confirmed amplicon sequences by cloning into the pGEM-T-Easy Vector and Sanger sequencing.

Data availability
The raw dataset in this study will be available, upon publication, in the Sequence Read Archive (SRA) repository with accession numbers SRR8489804, SRR8489848, SRR8492257, SRR8492258, SRR8492261, SRR8492260, SRR8492261, SRR8492262, SRR8492263, and SRR8492264. The 12 viral genome sequences obtained from this study were also deposited in GenBank, NCBI, with respective accession numbers.