Nearly complete genome sequences of the first two identified Colorado potato beetle viruses

The Colorado potato beetle is one of the most devastating potato pests in the world. However, its viral pathogens, which might have potential in pest control, have remained unexplored. Using high-throughput sequencing of Colorado potato beetle samples derived from prepupal larvae that died from an unknown infection, we identified two previously unknown RNA viruses and assembled their nearly complete genome sequences. Subsequent genetic and phylogenetic analysis demonstrated that the viruses, tentatively named Leptinotarsa iflavirus 1 and Leptinotarsa solinvi-like virus 1, are novel representatives of the Iflaviridae and Solinviviridae families, respectively. To the best of our knowledge, these are the first sequencing-confirmed insect viruses derived from Colorado potato beetle samples. We propose that Leptinotarsa iflavirus 1 may be associated with a lethal disease in the Colorado potato beetle.

The Colorado potato beetle (Leptinotarsa decemlineata, family Chrysomelidae) is one of the most widespread and destructive pests of Solanaceae plants in the world. Although insecticides help to reduce beetle populations, the Colorado potato beetle is notorious for its ability to develop resistance to all chemicals registered to date 1. There is therefore an urgent need for new effective methods to control Colorado potato beetle populations, and in particular for new biocontrol strategies, which are the most environmentally friendly 2. Existing methods of biological control of Colorado potato beetle populations use entomopathogenic fungi such as Beauveria bassiana and Metarhizium sp. 3 and the bacterium Bacillus thuringiensis 4, but not viruses. However, entomopathogenic virus-based bioagents, which are species-specific and generally safe for non-target organisms, are valuable tools for developing integrated pest management strategies. For other insect pests, positive examples include the nucleopolyhedroviruses of the Baculoviridae family, which are currently used to control a number of Lepidoptera species 4.
In a recent study we presented the results of a metagenomic analysis of viral diversity in non-target Colorado potato beetle samples, namely in public genomic and transcriptomic Leptinotarsa decemlineata NGS data obtained from the NCBI SRA database 5. In these data, we identified virus-associated genetic sequences belonging to more than 90 species and 32 families of viruses (excluding bacteriophages), a significant part of which were RNA viruses. It is worth noting that targeted studies dedicated to the identification of Colorado potato beetle viruses have not been conducted to date.
When rearing natural populations of the Colorado potato beetle under laboratory conditions 6,7, we observed a larval death rate of 8-35% (up to 60%) when the larvae completed feeding and entered the prepupal stage (7-11 days after molting at instar IV). The dead larvae showed a specific phenotype with loss of body turgor, body straightening and leg stretching (Fig. 1). At 1-3 days post mortem, the hemolymph acquired a dark tint characteristic of septicemia (Fig. 1C). To the best of our knowledge, these disease-specific symptoms have not been described earlier in the Colorado potato beetle. In the samples of dead Colorado potato beetle prepupae we identified two novel RNA viruses, one of which was attributed to the Iflaviridae family and the other to the Solinviviridae family.
Members of the Iflaviridae family (order Picornavirales) have single-stranded (+)RNA genomes of 9-11 kb. Their genome contains a single open reading frame encoding a polyprotein that is proteolytically processed into functional viral proteins 8. More than 550 iflavirus genomic sequences have been published in the NCBI GenBank database to date, of which 134 were published in 2021, 105 in 2022 and 42 in 2023. Iflaviruses infect arthropods, and most of the described representatives infect insects. Iflaviruses target specific tissues of the insect host, often affecting crucial physiological processes. They can counteract host immune responses by interfering with antiviral defense mechanisms, thereby ensuring successful viral replication 9. In some cases iflaviruses cause persistent infection, which makes them particularly interesting 10.
Members of the Solinviviridae family (order Picornavirales) have single-stranded (+)RNA genomes of 10-11 kb. Solinviviridae viruses infect arthropods, with most described representatives infecting insects and crustaceans 11,12. Some of these viruses are known to cause chronic infections, while others, which are more virulent, cause systemic infections and acute mortality; however, little is currently known about the effects of most of these viruses on their hosts 12. Only about 30 genome sequences of Solinviviridae viruses have been published in the NCBI GenBank database to date.

Assessment of viral load with quantitative PCR
In our previous analysis of public Colorado potato beetle NGS data we were able to assemble several near full-length genomes of viruses related to the Iflaviridae family 5, and thus we proposed that the observed death of Colorado potato beetle prepupal larvae might be at least partially attributed to iflaviral infection. To test this possibility, we undertook a qPCR study to assess the quantities of viral genetic material in symptomatic dead prepupae in comparison with asymptomatic healthy ones. The viral RNA-dependent RNA polymerase (RdRp) gene was selected to assess the viral load, and the L. decemlineata ribosomal protein RP4 and RP18 genes were taken as references, as proposed in 13. The designed oligonucleotide primers are shown in Table 1. The qPCR study demonstrated increased levels of viral RdRp in freshly dead symptomatic larvae ("deceased") as compared to healthy asymptomatic ones ("alive") (see Fig. 1A,B): the mean relative RdRp expression in the "deceased" group was about 32.9 times higher than in the "alive" group (Fig. 2). The observed difference was statistically significant (p = 0.0357).
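The reported p-value can be put in context with a small calculation. With the group sizes given in the Methods (N = 5 and N = 3), an exact two-sided Mann-Whitney test has only 56 possible group assignments, so the smallest attainable p-value is 2/56 ≈ 0.0357. The sketch below enumerates all assignments in plain Python; the expression values are hypothetical placeholders (the individual measurements are not given in the text), and the function names are illustrative rather than taken from the authors' R analysis.

```python
from itertools import combinations


def u_statistic(x, y):
    """Mann-Whitney U: number of (x, y) pairs with x > y, ties counted as 0.5."""
    return sum((a > b) + 0.5 * (a == b) for a in x for b in y)


def exact_mann_whitney_p(x, y):
    """Exact two-sided p-value obtained by enumerating every split of the pooled data."""
    pooled = list(x) + list(y)
    n = len(x)
    center = n * len(y) / 2  # expected U under the null hypothesis
    observed = abs(u_statistic(x, y) - center)
    hits = total = 0
    for idx in combinations(range(len(pooled)), n):
        chosen = set(idx)
        gx = [pooled[i] for i in idx]
        gy = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        total += 1
        if abs(u_statistic(gx, gy) - center) >= observed - 1e-12:
            hits += 1
    return hits / total


# Hypothetical relative RdRp expression values, chosen so that the groups do not overlap.
deceased = [41.0, 28.5, 35.2, 19.8, 52.3]
alive = [1.2, 0.7, 1.1]

p_value = exact_mann_whitney_p(deceased, alive)
print(round(p_value, 4))  # 0.0357
```

Any data set in which all five "deceased" values exceed all three "alive" values gives this same p-value, so the reported figure is consistent with, though it does not by itself prove, complete separation of the two groups.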
Thus, significantly increased levels of viral RdRp were observed in tissues of freshly dead larvae with characteristic symptoms, which indicates active virus replication. However, in the current study it was impossible to attribute the larval death to this particular virus infection. To prove that the virus causes lethal infection in the Colorado potato beetle, it is necessary to determine its infectivity and lethal dose in further in vitro and in vivo studies, using either purified viral particles or infectious cDNA. It is also extremely important to determine the genome sequence of the target virus.
Table 1. The selected genes and oligonucleotide primer sequences used for qPCR. The primers for Rp4 and Rp18 were taken from Shi et al. 13 and modified by Rotskaya U.

Sequencing and assembly of viral genome sequences
The sequencing yielded 473.5 × 2 thousand reads, with an average length of 124 nt, for the CPB3 sample and 480.5 × 2 thousand reads, with an average length of 127 nt, for the CPB6 sample. The de novo assembly yielded a nearly complete genome sequence of an iflavirus, named Leptinotarsa iflavirus 1 (in sample CPB3), with an average coverage depth of 834x (5.2% of total reads were aligned back to the assembled genomic sequence with at least 90% overlap and 95% identity), and a nearly complete genome sequence of a solinvi-like virus, named Leptinotarsa solinvi-like virus 1 (in sample CPB6), with an average coverage depth of 1116x (8.7% of total reads were aligned back to the assembled genomic sequence with at least 90% overlap and 95% identity). The genome sequences of Leptinotarsa iflavirus 1 and Leptinotarsa solinvi-like virus 1 were deposited in GenBank (accession IDs OR613011 and OR613010, respectively). Both viruses were found in both analyzed samples, but owing to the much lower coverage depth we were unable to assemble contigs containing the complete CDS of Leptinotarsa iflavirus 1 from sample CPB6 or the complete CDS of Leptinotarsa solinvi-like virus 1 from sample CPB3.
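The back-alignment criterion quoted above (at least 90% overlap and 95% identity) can be illustrated with a minimal sketch. The definitions used here are assumptions, since the text does not spell them out: overlap is taken as the aligned fraction of the read, identity as the fraction of matching positions within the aligned part, and mean depth as the number of retained aligned bases divided by the genome length. The function names and the toy alignments are hypothetical and do not reproduce the authors' pipeline.

```python
def passes_filter(read_len, aligned_len, matches,
                  min_overlap=0.90, min_identity=0.95):
    """Return True if an alignment meets the assumed overlap and identity thresholds."""
    if read_len <= 0 or aligned_len <= 0:
        return False
    overlap = aligned_len / read_len   # assumed definition: aligned fraction of the read
    identity = matches / aligned_len   # assumed definition: matches within the aligned part
    return overlap >= min_overlap and identity >= min_identity


def mean_depth(alignments, genome_len):
    """Mean coverage depth from the alignments that pass the filter."""
    kept = [a for a in alignments if passes_filter(*a)]
    return sum(aligned_len for _, aligned_len, _ in kept) / genome_len


# Toy alignments as (read length, aligned length, matching positions); values are made up.
toy_alignments = [
    (124, 120, 116),  # kept: about 97% overlap, about 97% identity
    (124, 100, 100),  # rejected: about 81% overlap
    (124, 124, 110),  # rejected: about 89% identity
    (127, 127, 127),  # kept: full-length perfect match
]

depth = mean_depth(toy_alignments, genome_len=100)
print(depth)
```

Real read mappers report these quantities in different ways, so the thresholds would need to be mapped onto the fields of whichever alignment format is used.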

Phylogenetic analysis of viral RdRp proteins
Since iflaviral sequences are very diverse, the phylogenetic analysis was performed using the most conserved polyprotein fragments containing RdRp. The multiple amino acid sequence alignment was built from the sequences with the highest similarity to Leptinotarsa iflavirus 1 RdRp (at least 30% identity), selected from the NCBI nr database using BLASTp. The phylogenetic tree was constructed with IQ-Tree software using the maximum likelihood method (Fig. 4). The sequences closest to Leptinotarsa iflavirus 1 RdRp were the RdRp of Apis iflavirus 2 (UCR92484), derived from metagenomic samples of honey bees collected in China in the Henan province in 2017, and the RdRp of Lampyris noctiluca iflavirus 1 (QBP37019), obtained from metagenomic analysis of a firefly sample collected in Finland in 2017.
To construct the phylogenetic tree for Leptinotarsa solinvi-like virus 1 RdRp, we used the multiple RdRp amino acid sequence alignment provided by ICTV (https://ictv.global/sites/default/files/inline-images/Figure3-1.v2.aa.alignment.fst, accessed in August 2023) 11. The alignment was augmented with sequences closely related to Leptinotarsa solinvi-like virus 1 RdRp, extracted from the NCBI nr database with BLASTp (at least 30% identity). Figure 5 shows a fragment of the resulting phylogram produced with IQ-Tree.
The Solinviviridae sequences extracted from ICTV were also found to include the RdRp encoded by the putative viral genetic sequence Leptinotarsa decemlineata TSA (GenBank: GEEF01170301), obtained from a transcriptomic assembly of Colorado potato beetle (CPB) samples (BioProject ID: PRJNA297027). This RdRp has the highest similarity to the Leptinotarsa solinvi-like virus 1 RdRp among all the compared sequences; the nucleotide sequence identity was 97% (1212/1251). We also performed a pairwise comparison of the full-length polyproteins encoded by the Leptinotarsa solinvi-like virus 1 genome and the Leptinotarsa decemlineata TSA sequence. The total length of the polyprotein sequence encoded by Leptinotarsa decemlineata TSA is 3819 aa. The identity of the amino acid sequences of the compared polyproteins is 96.86% (3699 out of 3819 aa), and most amino acid substitutions were found to be conservative (the sequence similarity according to BLOSUM62 is 98.56%). As compared to the Leptinotarsa solinvi-like virus 1 polyprotein, a fragment of 19 aa is inserted. The comparison of nucleotide sequences demonstrated that the Leptinotarsa decemlineata TSA sequence lacks terminal fragments, including the polyA tail. The polyprotein-encoding nucleotide sequence of Leptinotarsa decemlineata TSA contains 480 SNPs, comprising 379 transitions and 101 transversions, or 360 synonymous and 120 missense substitutions (55 of them conservative), and 1 deletion (57 nt). Thus, it appears that the Leptinotarsa decemlineata TSA sequence (GEEF01170301) corresponds to a near full-length genomic sequence of a related solinvi-like virus. In addition to Leptinotarsa decemlineata TSA, the other sequences closest to Leptinotarsa solinvi-like virus 1 RdRp were the two RdRp sequences of Hangzhou solinvi-like virus 1 (NCBI Protein: UHR49768 and UHR49784), derived from metagenomic samples of the leaf-feeding beetle Altica cyanea (Chrysomelidae) collected in China in 2016.
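The split of SNPs into transitions and transversions described above can be sketched as follows. This is a minimal illustration on a made-up pair of aligned sequences, not the authors' procedure: it assumes the two sequences are already aligned to equal length, treats purine-to-purine and pyrimidine-to-pyrimidine changes as transitions, and skips gaps and ambiguous bases. The function name and the toy sequences are hypothetical.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitutions(ref, alt):
    """Count (transitions, transversions) between two aligned nucleotide sequences."""
    if len(ref) != len(alt):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = 0
    for a, b in zip(ref.upper().replace("U", "T"), alt.upper().replace("U", "T")):
        if a == b or a not in "ACGT" or b not in "ACGT":
            continue  # identical position, gap, or ambiguous base
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    return transitions, transversions


# Toy aligned fragments (hypothetical): G>A and T>C are transitions, A>T is a transversion.
ts, tv = classify_substitutions("ATGGCCAAGT", "ATGACCATGC")
print(ts, tv)  # 2 1

# Figures quoted in the text, used only for ratio and percentage arithmetic.
reported_ts, reported_tv = 379, 101
print(round(reported_ts / reported_tv, 2))  # transition/transversion ratio
print(round(100 * 3699 / 3819, 2))          # polyprotein amino acid identity, %
print(round(100 * 1212 / 1251, 2))          # RdRp nucleotide identity, %
```

The quoted counts are internally consistent: 379 + 101 = 480 = 360 + 120, and the 57-nt deletion corresponds to 19 codons, matching the 19-aa length difference.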
Thus, for the first time, near full-length genome sequences of viruses belonging to the order Picornavirales were obtained from natural samples of the Colorado potato beetle. The phylogenetic analysis allowed us to assign these viruses to the iflaviruses and solinvi-like viruses as Leptinotarsa iflavirus 1 and Leptinotarsa solinvi-like virus 1, respectively. The analysis of the amino acid sequences of the polyproteins encoded by the genomes of these viruses confirmed the presence of characteristic viral proteins and demonstrated that their domain structure corresponds to that of other described iflaviruses and solinvi-like viruses, respectively.
However, we cannot be sure that the observed death of L. decemlineata larvae was caused by these particular viruses, although the external symptoms (delayed septicemia), the qPCR data (a roughly 30-fold increase in viral load as compared to healthy insects) and the NGS detection of these particular viruses in tissues of dead symptomatic larvae support this hypothesis. It is noteworthy that the Colorado potato beetles died at a specific period of their development (the transition to the prepupal stage); we assume this might be associated with increased vulnerability of the insects to generalized viremia at the beginning of the body restructuring period. Further studies will be aimed at isolating the viral particles and assaying their infectivity and pathogenicity in detail using both in vitro and in vivo models. We believe that our results may contribute to the future development of novel biological control methods for the Colorado potato beetle.

Biological samples collection
Leptinotarsa decemlineata III-IV instar larvae were collected from private potato fields free from any insecticide treatment (Karasuk, Novosibirsk region, Russian Federation; 53°43′N; 77°38′E). Because these potato fields are not located in protected areas, no special permission was needed to collect beetles. The landowners did not prevent access to the fields. Endangered or protected species were not used in this work. The insects were kept in laboratory conditions as described earlier 6,7. The larvae were kept in ventilated plastic containers (300 ml; 10 larvae per container) at a temperature of 24-25 °C with a 16:8 h light/dark period and fed daily with fresh Solanum tuberosum foliage. Insects that died in the prepupal stage with presumable viral infection symptoms (Fig. 1) were immediately frozen in liquid nitrogen and then stored at −80 °C until RNA extraction.

Samples preparation and qPCR
The whole larvae were frozen in liquid nitrogen and stored at −18 °C. The samples were lyophilized at −53 °C and 400 mTorr for 24 h. The lyophilized larvae were homogenized with micro pestles in liquid nitrogen and treated with Lira-reagent (BioLabMix Ltd., Novosibirsk, Russia). The subsequent total RNA extraction was performed according to the Lira-reagent protocol. DNase treatment was carried out according to the protocol of the DNase I (RNase-free) kit (TransGen Biotech Co. Ltd., China). The reverse transcription of RNA to cDNA was performed with RevertAid™ M-MuLV Reverse Transcriptase (Fermentas, Vilnius, Lithuania) and 2.0 pmol of 9N primers. qPCR was performed as described earlier 14,15. The PCR conditions were as follows: 95 °C for 3 min, followed by 40 cycles of 94 °C for 15 s and 64 °C for 30 s, followed by melting curve analysis (70-90 °C). The L. decemlineata ribosomal protein RP4 and RP18 genes were taken as references, as proposed in 13. The viral RNA-dependent RNA polymerase (RdRp) gene was selected to assess the viral load. Primers were designed with the Primer-BLAST tool 16 and IDT OligoAnalyzer 3.1 17. The designed oligonucleotide primers were synthesized by Biosset Ltd. (Novosibirsk, Russia). The primer sequences are provided in Table 1.

qPCR gene expression calculations and statistical analysis
Gene expression was calculated using the 2^(−ΔΔCq) method with Bio-Rad CFX Manager software (Bio-Rad, USA). The program Past 4.03 was used to analyze the qPCR data. The statistical significance of the observed differences in normalized viral RdRp expression values between dead larvae samples with presumable viral infection (N = 5) and asymptomatic larvae samples (N = 3) was examined with a two-sided Mann-Whitney test (the critical significance level was set to 0.05). The statistical analysis was performed in the R statistical analysis environment (v.4.2.1) 18.
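The 2^(−ΔΔCq) calculation named above can be sketched in a few lines. This is a simplified illustration, not a reproduction of the Bio-Rad CFX Manager computation: it assumes 100% amplification efficiency for all primer pairs, normalizes the target Cq to the arithmetic mean Cq of the two reference genes (equivalent to the geometric mean of their quantities), and uses hypothetical Cq values and function names.

```python
from statistics import mean


def delta_cq(target_cq, reference_cqs):
    """Target Cq normalized to the mean Cq of the reference genes."""
    return target_cq - mean(reference_cqs)


def relative_expression(target_cq, reference_cqs, calibrator_delta_cq):
    """Fold change by the 2^(-ddCq) method, assuming 100% PCR efficiency."""
    dd_cq = delta_cq(target_cq, reference_cqs) - calibrator_delta_cq
    return 2 ** (-dd_cq)


# Hypothetical Cq values: RdRp target, then the two reference genes (Rp4, Rp18).
alive_delta = delta_cq(30.0, [18.0, 20.0])  # calibrator sample: dCq = 11.0
fold_change = relative_expression(25.0, [18.0, 20.0], alive_delta)
print(fold_change)  # 32.0
```

A Cq shift of five cycles corresponds to a 32-fold difference under these assumptions, which is the order of magnitude reported for the "deceased" group; with real primer efficiencies below 100% the fold change would be smaller.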

Samples preparation and sequencing
Individual dead CPB prepupae were homogenized manually in 1 ml of standard PBS in 1.5 ml Eppendorf microtubes using sterile Axygen disposable pestles (PES-15-B-SI, Axygen, USA). The homogenates were centrifuged in an Eppendorf MiniSpin centrifuge at 10,000 × g for 10 min, and the supernatants were collected into 15 ml tubes and diluted 10 times with sterile PBS. The resulting solutions were filtered through a Millex-HV filter with a pore size of 0.45 μm (Millex® PVDF syringe filter, Merck, USA), and the filters were additionally washed with 5 ml of sterile PBS. The resulting solutions were concentrated to a volume of 200 μl using Amicon® Ultra 4 50K centrifugal filters (Merck Millipore Ltd, Ireland) in an Eppendorf 5804 centrifuge at 3000 × g with adapters for 15 ml tubes. Aliquots of 100 μl of the resulting solutions were supplemented with 2 μl of MgCl2 (100 mM) and shaken, and then 10 μl of benzonase (Sigma, USA) with an activity of 250 units/μl was added. The suspension was incubated at 37 °C for 30 min, and then EDTA was added to inactivate the benzonase.
Total RNA was then extracted from 100 μl of the benzonase-treated sample with TRIzol (Thermo Fisher Scientific, USA) according to the recommended protocol. Glycogen (40 μg, as an aqueous solution with a concentration of 20 mg/ml) was used as a co-precipitant. The precipitate was dissolved in 30 μl of deionized water, and the solution was then used to obtain double-stranded DNA fragments with the SISPA (Sequence-Independent, Single-Primer Amplification) protocol, according to 19.
The resulting dsDNA fragments were purified from residual reaction components and by-products with AMPure beads (Beckman Coulter) and then used to prepare NGS libraries. Nucleic acid concentrations were measured with a Qubit 3.0 using the Qubit dsDNA HS Assay Kit (Thermo Fisher Scientific). The NGS library preparation was carried out using the NEBNext Ultra II FS DNA Library Prep Kit for Illumina (NEB, USA), which performs fragmentation, end repair and dA-tailing with a single enzyme mix, followed by adapter ligation. The NEBNext Multiplex Oligos for Illumina (Index Primer Set 1) (NEB, USA) were used for multiplexing. The sequencing was performed on the Illumina iSeq 100 platform (300 cycles).
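The benzonase step above implies the following working amounts, shown here as a small dilution calculation (C1·V1 = C2·V2). The sketch assumes that the volumes are simply additive and ignores the EDTA added afterwards; the variable and function names are illustrative only.

```python
def final_concentration(stock_conc, stock_vol, total_vol):
    """Concentration after dilution, from C1 * V1 = C2 * V2."""
    return stock_conc * stock_vol / total_vol


# Volumes and stock values taken from the protocol text.
SAMPLE_UL = 100.0
MGCL2_UL = 2.0
MGCL2_STOCK_MM = 100.0
BENZONASE_UL = 10.0
BENZONASE_U_PER_UL = 250.0

# Assumption: total reaction volume is the plain sum of the three components.
total_ul = SAMPLE_UL + MGCL2_UL + BENZONASE_UL

mgcl2_final_mm = final_concentration(MGCL2_STOCK_MM, MGCL2_UL, total_ul)
benzonase_units = BENZONASE_UL * BENZONASE_U_PER_UL
benzonase_final_u_per_ul = final_concentration(BENZONASE_U_PER_UL, BENZONASE_UL, total_ul)

print(f"total volume: {total_ul:.0f} ul")
print(f"MgCl2: {mgcl2_final_mm:.2f} mM")
print(f"benzonase: {benzonase_units:.0f} units ({benzonase_final_u_per_ul:.1f} units/ul)")
```

Under these assumptions the reaction contains about 1.8 mM MgCl2 and 2500 units of benzonase in 112 μl.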

Figure 2. Iflaviral RdRp relative expression in samples taken from alive and deceased Colorado potato beetle prepupae. The relative RdRp expression was determined as detailed in the corresponding "Materials and methods" section. The RdRp expression values were normalized to the reference genes: ribosomal proteins L4 and L18 (Rp4 and Rp18). The difference in viral loads was found to be statistically significant (Mann-Whitney two-tailed test, p = 0.0357).

Figure 4. The phylogenetic tree of RdRp amino acid sequences of Leptinotarsa iflavirus 1 and closely related iflaviruses and picorna-like viruses. The tree was constructed with IQ-Tree by the maximum likelihood method. The numbers next to the branching points denote the support indices (≥ 70%) determined by bootstrap statistical analysis (1000 replications). The tree is drawn to scale, with branch lengths corresponding to evolutionary distances.

Figure 5. The phylogenetic tree of RdRp amino acid sequences of Leptinotarsa solinvi-like virus 1 and closely related solinviviruses and solinvi-like viruses. The tree was constructed with IQ-Tree by the maximum likelihood method. The numbers next to the branching points denote the support indices (≥ 70%) determined by bootstrap statistical analysis (1000 replications). The tree is drawn to scale, with branch lengths corresponding to evolutionary distances. The tree is a part of a larger one (not shown).