Baculoviruses are a large family of rod-shaped, invertebrate-infecting viruses with large circular, covalently closed, double-stranded DNA genomes of between 80 and 180 kb. This family was initially subdivided taxonomically into nucleopolyhedroviruses (NPVs) and granuloviruses (GVs) based on viral occlusion morphology1. However, as an increasing number of genome sequences became available, it became clear that lepidopteran NPVs and GVs are more closely related to each other than to dipteran and hymenopteran NPVs. Therefore, a new taxonomic division that follows the evolution of the host more closely2 was accepted by the International Committee on Taxonomy of Viruses (ICTV). In the 10th report of the ICTV (online, 2019), the family Baculoviridae was still divided into four genera: Alphabaculovirus, Betabaculovirus, Deltabaculovirus and Gammabaculovirus. To date, 85 baculovirus genomes have been sequenced, including 55 from Alphabaculovirus (lepidopteran NPVs), 26 from Betabaculovirus (lepidopteran GVs), 1 from Deltabaculovirus (dipteran NPVs) and 3 from Gammabaculovirus (hymenopteran NPVs).

Betabaculoviruses are granuloviruses (GVs) infecting only lepidopteran hosts, whereas alphabaculoviruses, deltabaculoviruses and gammabaculoviruses are nucleopolyhedroviruses (NPVs) isolated from a wider range of hosts, including lepidopterans, dipterans and hymenopterans.

Lepidopteran NPVs are further divided into two groups, I and II, based on gene content3. Notably, the budded virus (BV) fusion protein in Group I NPVs is GP64, whereas Group II NPVs lack gp64 and utilize the F protein4. GVs are classified into three types according to tissue tropism5. Type I GVs, such as Xestia c-nigrum GV (XcGV), kill hosts slowly, infecting only the midgut epithelium and fat body tissue6. Type II GVs, such as Cydia pomonella GV (CpGV), kill hosts rapidly, similar to typical lepidopteran NPVs, by infecting most of the host’s major tissues7. Type III GVs infect only the midgut epithelium; only one GV, Harrisina brillians GV (HabrGV)8, has been identified as Type III. Phylogenetic analysis based on conserved GV genes does not support monophyletic origins for these different types of pathogenesis9.

Mythimna unipuncta granulovirus (MyunGV-A), originally described as Pseudaletia unipuncta granulovirus (PsunGV) based on an isolate from a Hawaiian population of Mythimna (Pseudaletia) unipuncta10, was recognized as PsunGV by the ICTV in 2002. In 2017, PsunGV was proposed to be renamed MyunGV-A by the ICTV to reflect the fact that the new species MyunGV-B is the second distinct betabaculovirus to be isolated from the host Mythimna (Pseudaletia) unipuncta.

MyunGV-A (PsunGV-H) was first recognized through its synergistic factor (later described as enhancin)10. Subsequent studies on MyunGV-A mostly focused on the mechanisms of enhancement and the enhancin gene. The enhancin of MyunGV-A can interact with viral particles and increase the binding of viral particles to insect midgut microvilli, thereby dramatically promoting the oral infectivity of Mythimna unipuncta NPV and decreasing the larval survival time11. The enhancin of MyunGV-A, comprising 901 amino acids, has been purified and characterized12. Overall, high-throughput sequencing of baculovirus genomes appears to be essential for analysing the molecular mechanisms of baculovirus infection and understanding baculovirus genome evolution. In this study, the morphological characteristics of MyunGV-A were observed by electron microscopy (EM). We present the complete sequence and organization of the MyunGV-A genome and compare it with other baculoviruses by genomic and phylogenetic analysis. A total of 24 OB proteins of MyunGV-A were identified.

Materials and methods

Virus preparation and DNA extraction

MyunGV-A (PsunGV-H) was obtained from Y. Tanada and kept at the Institute of Zoology, Chinese Academy of Sciences13. The virus was propagated in laboratory stocks of healthy second-instar M. separata larvae by per os infection. The occlusion bodies (OBs) produced in larval cadavers were purified by a standard method14.

To extract viral DNA, the purified OBs were resuspended in 0.1 M sodium carbonate solution [0.1 M Na2CO3, 0.17 M NaCl, 0.01 M EDTA (pH 10.5)] and incubated at 37 °C for 1 h. The pH was adjusted to 7.0 with 0.1 M HCl. Sarcosyl (0.5%) and proteinase K (0.25 mg/mL) were added, and the sample was incubated at 37 °C for 2 h and then at 65 °C for 2 h. Genomic DNA was extracted with an equal volume of phenol and chloroform. The DNA was precipitated with two volumes of 100% ethanol, washed with 70% ethanol, and dissolved in TE buffer [10 mM Tris–HCl (pH 8); 1 mM EDTA].

Electron microscopy observation

OBs of MyunGV-A were observed by scanning electron microscopy (SEM; Hitachi S3400N) and transmission electron microscopy (TEM; JEOL JEM1230) according to standard methods15.

DNA sequencing and analysis

A random genomic library of MyunGV-A was constructed according to the “partial filling-in” method16. A total of 831 recombinant plasmids containing 1.5 to 5.0 kb viral DNA fragments were prepared for sequencing using a BigDye Terminator v3.1 (ABI) and a 3130XL Genetic analyser (ABI). The combined sequence generated from these clones represented sixfold genomic coverage. The gaps and ambiguities in the assembled sequence were resolved by PCR. All sequences were assembled into contigs using SeqMan from the DNASTAR 7.0 software package.

ORFs were defined using ORF Finder. The criterion for defining an ORF was a size of 50 or more codons with minimal overlap. DNA and protein comparisons were performed using BLAST. For protein homology detection, we used the HHpred webserver for the translated ORFs17,18. Multiple alignments and percentage identities were obtained using ClustalW. Promoter motifs present upstream of the putative ORFs were screened as described previously19. Identity among homologous genes was determined with MegAlign software using ClustalW with default parameters. Homologous repeat regions (hrs) were analysed by Tandem Repeats Finder. GeneParityPlot analysis was performed as described by Hu et al.20.
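The ORF criterion above (50 or more codons) can be illustrated with a minimal sketch. This is not the algorithm used by ORF Finder; it assumes ATG start codons, the three standard stop codons and a linear sequence (the circularity of the genome and the "minimal overlap" filter are ignored), and the function names are hypothetical.

```python
# Minimal six-frame ORF scan; an illustration only, not the ORF Finder algorithm.
STOPS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(seq, min_codons=50):
    """Return (strand, start, end, codons) for ATG..stop ORFs of >= min_codons.

    Coordinates are 0-based and end-exclusive on the strand that was scanned.
    'codons' counts coding codons (start codon included, stop codon excluded).
    """
    seq = seq.upper()
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None  # position of the first ATG of the currently open ORF
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i
                elif start is not None and codon in STOPS:
                    n_codons = (i - start) // 3
                    if n_codons >= min_codons:
                        orfs.append((strand, start, i + 3, n_codons))
                    start = None
    return orfs
```

Raising `min_codons` makes the scan more conservative; overlapping candidates would still need to be resolved by comparison with known homologues, as described in the text.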

Protein analysis of OBs of MyunGV-A

Freshly purified OBs of MyunGV-A suspended in ddH2O were incubated with an equal volume of lysis buffer (0.1 M Na2CO3, 0.17 M NaCl, 0.01 M EDTA, pH 10.6) at 4 °C for 1 h. The pH was adjusted to 8.0 with 0.1 M HCl. The samples were added to 10 mM Tris–HCl containing β-mercaptoethanol (0.2%) and sodium dodecyl sulfate (SDS) and heated at 95 °C for 10 min. The proteins of MyunGV-A OBs were separated by SDS-PAGE using an 8% to 15% gradient gel. The protein bands were excised as 29 samples, ordered by molecular weight from small to large, for LC–MS/MS analysis (LCQ Deca Xp plus, ThermoFinnigan). LC–MS/MS analysis and protein identification were performed as described by Shi XF21. The raw files of MS spectra were searched against the putative protein database of MyunGV-A (NC_013772.1).

Phylogenetic analysis of MyunGV-A

The amino acid sequences encoded by the 38 core genes described for all members of the family Baculoviridae22 from 82 complete baculovirus genomes (excluding 3 incomplete genomes) in the NCBI genome database were joined together in a consistent order (the ORF order of AcMNPV) and aligned using MAFFT with default parameters. A phylogenetic tree based on these sequences was constructed using MEGA 7.0.1423. Maximum likelihood (ML) tree construction methods were used with 1000 bootstrap resamples. The GTR + G + I substitution model was used for ML analysis.
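The concatenation step described above can be sketched as follows. This is a minimal illustration that assumes the core proteins of each genome are already available as a mapping from gene name to amino acid sequence; the function name and data layout are hypothetical, and the subsequent alignment and tree construction are not reproduced here.

```python
def concatenate_core_proteins(genomes, gene_order):
    """Join core protein sequences per genome in one fixed gene order.

    genomes:    {genome_name: {gene_name: protein_sequence}}
    gene_order: list of gene names, e.g. following the AcMNPV ORF order
    Returns {genome_name: concatenated_sequence}.
    """
    joined = {}
    for name, proteins in genomes.items():
        # Every genome must contribute every core gene, otherwise the
        # concatenated sequences would not be comparable position by position.
        missing = [gene for gene in gene_order if gene not in proteins]
        if missing:
            raise KeyError(f"{name} lacks core genes: {missing}")
        joined[name] = "".join(proteins[gene] for gene in gene_order)
    return joined
```

Using one shared `gene_order` for all genomes is what keeps homologous proteins in the same region of every concatenated sequence before alignment.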

Results and discussion

Electron microscopy observation

SEM revealed that the purified OBs of MyunGV-A have an elongated ellipsoidal shape, with a length of approximately 0.5 μm and a width of approximately 0.3 μm (Fig. 1A). TEM showed a single rod-shaped ODV of approximately 300 nm in length and 40 nm in width embedded in a granular OB (Fig. 1B,C). These are typical GV morphological characteristics.

Figure 1

A scanning electron micrograph of MyunGV-A (A) and transmission electron micrograph of MyunGV-A (B,C).

Sequence and genome characteristics of MyunGV-A

The size of the MyunGV-A genome is 176,677 bp (GenBank accession no. NC_013772), with a G+C content of 39.79%. MyunGV-A is the second largest GV sequenced to date, with only XcGV (178,733 bp)6 being larger. Computer-assisted ORF analysis detected 372 ORFs of 50 or more codons and 9 homologous regions (hrs) in the MyunGV-A genome; 189 ORFs overlap significantly with, or are completely contained within, other MyunGV-A ORFs. The deduced protein sequences of these 189 ORFs show no significant homology to protein sequences in GenBank. The remaining 183 ORFs and 9 hrs are shown in Table 1 according to location, orientation, size of the predicted amino acid sequence, potential baculovirus homologues, best matched baculovirus ORF and BLAST score (bits).
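For reference, the G+C content reported above is the percentage of G and C among all unambiguous bases. A minimal sketch, assuming the genome is available as a plain string (the function name is hypothetical and ambiguous bases such as N are excluded from the denominator):

```python
def gc_content(seq):
    """Return the G+C percentage of a DNA sequence, ignoring non-ACGT symbols."""
    seq = seq.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return 100.0 * (seq.count("G") + seq.count("C")) / counted
```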

Table 1 MyunGV-A (PsunGV-H) open reading frames (ORFs) and homologous repeat regions (hrs).

The first nucleotide of the granulin start codon was defined as nucleotide 1, and the ORF encoding granulin was accordingly designated as the first ORF. The putative ORFs were numbered sequentially in this orientation. Ninety-nine ORFs are in the granulin-sense orientation and 84 in the opposite orientation. The 183 putative ORFs of MyunGV-A were searched for promoter motifs within 180 bp upstream of the initiation codon of each ORF; 42 were found to have only a canonical baculovirus early gene promoter motif (a TATA box followed by a CAGT or CATT motif 20 to 40 bp downstream)24,25. Seventy-five ORFs possess only a late promoter motif ((A/T/G)TAAG); 32 contain both early and late promoter motifs, which might allow transcription during both early and late stages of infection. Thirty-four lack any recognizable canonical promoter motif.
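The promoter screen described above can be sketched with regular expressions. This is a minimal illustration rather than the procedure actually used: it assumes the TATA box is the literal tetranucleotide TATA, that the 20 to 40 bp spacing is measured from the end of TATA to the start of CAGT/CATT, and that the upstream sequence is supplied in the sense orientation of the ORF. The function name is hypothetical.

```python
import re

# Early motif: TATA, then 20-40 bases, then CAGT or CATT (assumed spacing rule).
EARLY = re.compile(r"TATA[ACGT]{20,40}?CA[GT]T")
# Late motif: (A/T/G)TAAG.
LATE = re.compile(r"[ATG]TAAG")


def classify_promoter(upstream, window=180):
    """Classify the region upstream of a start codon as early/late/both/none.

    upstream: sequence ending immediately before the initiation codon.
    Only the last 'window' bases are searched.
    """
    region = upstream.upper()[-window:]
    has_early = EARLY.search(region) is not None
    has_late = LATE.search(region) is not None
    if has_early and has_late:
        return "both"
    if has_early:
        return "early"
    if has_late:
        return "late"
    return "none"
```

Tallying the four labels over all ORFs gives the kind of breakdown reported in the text; a stricter TATA definition or a different spacing convention would shift the counts.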

Comparison of MyunGV-A ORFs with those of other baculoviruses

Comparison of gene organization and homology between MyunGV-A and other baculovirus genomes provides insight into gene conservation and the diversity of baculoviruses. MyunGV-A shares 88 ORFs with AcMNPV, 166 with XcGV, 169 with HearGV and TnGV, 139 with MyunGV-B and 104 with CpGV (Table 1). The average amino acid sequence identities of homologous ORFs between MyunGV-A and AcMNPV, XcGV, HearGV, TnGV, MyunGV-B and CpGV are 34%, 79%, 79%, 98%, 62% and 44%, respectively. A total of 180 ORFs were assigned a function or are homologous with ORFs of other baculoviruses, of which three (ORF68, -69 and -147) have homologues only in TnGV. ORF68 and ORF147 share 100% amino acid sequence identity with their TnGV homologues, whereas ORF69 shares 94%. In addition, ORF69 shares 37% identity with a sequence from the bacterium Zooshikella ganghwensis. Three ORFs, ORF113, -133 and -166, were identified as unique to MyunGV-A.
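The identity figures above are averages of pairwise percentage identities. A minimal sketch of that calculation, assuming each pair of homologues has already been aligned (for example with ClustalW) and that identity is computed over columns without gaps; other tools may use a different denominator, and the function names are hypothetical:

```python
def percent_identity(aligned_a, aligned_b):
    """Percentage of identical residues over gap-free alignment columns."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    compared = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # skip columns where either sequence has a gap
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def mean_identity(aligned_pairs):
    """Average the pairwise identities of a list of aligned homologue pairs."""
    values = [percent_identity(a, b) for a, b in aligned_pairs]
    return sum(values) / len(values)
```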

GeneParityPlot analysis

The gene order of MyunGV-A was compared with that of AcMNPV, XcGV, HearGV, TnGV, MyunGV-B and CpGV by GeneParityPlot analysis (Fig. 2)20. The gene organization of MyunGV-A is distinctly different from that of AcMNPV, except for two reverse collinear gene clusters, one of which is a 12-gene group that includes the core cluster of four genes, lef-5, 38K (ac98), ac96 and helicase, whose relative positions are conserved in baculovirus genomes26. In contrast, the gene order of MyunGV-A is extensively collinear with that of XcGV, HearGV, TnGV, MyunGV-B and CpGV, with the highest collinearity to TnGV; the few genes found in a different order are mostly bro genes or lie near bro genes. Interestingly, the arrangement of the MyunGV-A genome shows lower collinearity to MyunGV-B, a virus from the same host, than to XcGV, HearGV and TnGV.
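The data behind a gene parity plot can be sketched as a list of (ORF number in genome A, ORF number of its homologue in genome B) points, from which collinear runs can be read off. The following is a simplified illustration, not the method of Hu et al.: it assumes a one-to-one homologue table, treats a run as collinear only when the genome B numbers change by exactly +1 (same order) or -1 (reversed order) between consecutive points, and assigns each point to a single run. The function names are hypothetical.

```python
def parity_points(homologues):
    """Return (orf_in_A, orf_in_B) points sorted by position in genome A."""
    return sorted(homologues.items())


def collinear_blocks(points, min_len=3):
    """Group consecutive points into runs with a constant step of +1 or -1 in B.

    Returns a list of (direction, run) tuples, where direction is 1 for the
    same gene order and -1 for a reversed (inverted) cluster.
    """
    blocks = []
    current = [points[0]] if points else []
    direction = 0  # 0 means the direction of the open run is not yet fixed
    for prev, nxt in zip(points, points[1:]):
        step = nxt[1] - prev[1]
        if step in (1, -1) and direction in (0, step):
            direction = step
            current.append(nxt)
        else:
            if len(current) >= min_len:
                blocks.append((direction, current))
            current = [nxt]
            direction = 0
    if len(current) >= min_len:
        blocks.append((direction, current))
    return blocks
```

Plotting the points as a scatter gives the parity plot itself; runs with direction -1 correspond to reverse collinear clusters such as those described for AcMNPV.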

Figure 2

GeneParityPlots analysis.

Homologous regions (hrs)

A typical feature of most baculovirus genomes is the presence of homologous regions (hrs) interspersed throughout the genome. The numbers of hrs in 82 complete baculovirus genomes range from none to 17, with 12 baculovirus genomes lacking typical hrs sequences (Table S1). In general, hrs are characterized by AT-rich and imperfect, reiterated palindromic sequences that may be replaced with direct repeats.

Eight major hr sequences (hr1-8) and one short hr sequence (hr5a) were identified in the MyunGV-A genome (Table 1). Each of hr1-8 contains two to five direct imperfect repeats of approximately 120 bp, whereas hr5a does not contain multiple repeated sequences. It is interesting to note that hr5a is located in ORF122 (vp91), and the same situation exists in the XcGV and HearGV genomes. Six hrs were identified in the MyunGV-B genome, which lacks sequences corresponding to hr1 and hr5/5a of MyunGV-A27. No hrs were found in the TnGV genome deposited in 2018 (NC_038375.1), and no analysis of that sequence has been published.

Although the nucleotide sequences of the repeats vary between hrs, and even within the same hr, two highly conserved 10 bp core sequences (TTAAT(G/A)TCGA) were found at roughly the same position (approximately 35 bp) in each repeat6. In the MyunGV-A genome, the core sequences in each repeat of hr1, -2, -4, -7 and -8 are in the same direction, while those of hr3, -5, -5a and -6 are in the opposite direction (Fig. 3).
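Locating the core sequence and its direction can be sketched with two regular expressions, one for the core as written and one for its reverse complement. This is a minimal illustration under the assumption that "direction" refers to the strand on which TTAAT(G/A)TCGA is read; the function name is hypothetical.

```python
import re

# Core sequence TTAAT(G/A)TCGA and its reverse complement TCGA(C/T)ATTAA.
CORE_FORWARD = re.compile(r"TTAAT[GA]TCGA")
CORE_REVERSE = re.compile(r"TCGA[CT]ATTAA")


def core_hits(seq):
    """Return sorted (position, strand) tuples for every core sequence match.

    Positions are 0-based start coordinates on the given strand; strand is
    '+' for the core as written and '-' for its reverse complement.
    """
    seq = seq.upper()
    hits = [(m.start(), "+") for m in CORE_FORWARD.finditer(seq)]
    hits += [(m.start(), "-") for m in CORE_REVERSE.finditer(seq)]
    return sorted(hits)
```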

Figure 3

Alignment of homologous regions in the MyunGV-A genome. The conserved 10 bp core sequences (TTAAT(G/A)TCGA) are indicated by shaded boxes. The arrows indicate the direction of the core sequences.

Hrs have been reported to function in replication origins28,29 and serve as enhancers of transcription of early genes30. In addition, the number of hrs is connected to the replication efficiency or pathogenicity of a baculovirus. Deletion of one to five hrs of AcMNPV had little or no effect on virus infection, while deleting six or seven hrs resulted in 90% BV reduction. Deletion of all eight hrs caused 99.9% BV reduction and delay of early and late gene expression but did not completely inhibit virus production31.

Baculovirus repeated ORFs (bro genes)

Bro genes have been identified in most baculovirus genomes sequenced to date. The number of bro genes in different baculovirus genomes varies considerably. Thirteen of 82 complete baculovirus genomes have only one bro gene, whereas Lymantria dispar MNPV (LdMNPV) has 16 bro genes. Bro genes are entirely absent from 19 baculovirus genomes sequenced to date (Table S1).

In MyunGV-A, 12 bro genes were identified, three of which (ORF135, -136 and -137) are adjacent. BLAST searches of the amino acid sequences of these three bro genes in NCBI showed that ORF135 best matches TnGV ORF127 (81%), ORF136 best matches HearGV ORF54 (69%), and ORF137 best matches TnGV ORF127 (82%). The TnGV genome has 4 adjacent bro genes (ORF124, -125, -126, -127), and the HearGV genome has 3 pairs of adjacent bro genes (hear54 and -55, hear101 and -102, hear158 and -159), but no adjacent bro genes were found in the XcGV and MyunGV-B genomes.

The exact function of bro genes is not yet clear, though their presence is very significant for baculoviruses. Studies on the function of bro genes have mostly focused on BmNPV and have found that BRO-A and C proteins can bind to DNA in infected cells32; BRO-A may be involved in influencing host DNA replication, similar to a laminin-binding protein33.

In addition, BmNPV BRO proteins act as nucleocytoplasmic shuttling proteins via the CRM1-mediated nuclear export pathway34. Recently, BmNPV BRO-B and E proteins associated with host T-cell intracellular antigen 1 homologue (BmTRN-1) were shown to be involved in the inhibitory regulation of certain mRNAs at the post-transcriptional level during infection35. The function of other baculovirus BRO proteins has seldom been reported.

Two repeat genes in MyunGV-A

Two repeat genes (ORF39 and ORF49), with amino acid sequence identities of 100%, were found in the MyunGV-A genome; the former is in the granulin-sense orientation and the latter in the opposite orientation.

These two genes have no homologues in the XcGV, MyunGV-B and CpGV genomes. Only one gene in the TnGV genome, ORF43, matches them, with an amino acid sequence identity of 97%. Two genes in the HearGV genome, ORF53 and ORF157, are homologous to them, with amino acid sequence identities of 86% and 85%, respectively; ORF53 and ORF157 of HearGV share 99% amino acid sequence identity with each other. Genes present in two copies within one genome have also been found in other baculoviruses, such as odv-e66, p26 and dbp of EcobNPV36 and odv-e66 and p26 of SfMNPV37.

BLAST results of amino acid sequences of these two homologous genes in MyunGV-A in NCBI suggested they match hr3 and hr4 of Heliothis virescens ascovirus 3e (amino acid sequence identities both 49%). In addition, they match the 70.4-kDa C-terminal Zn-finger DNA-binding domain of Spodoptera frugiperda ascovirus 1a (amino acid sequence identities of 48%), which suggests that their function may be associated with DNA binding.

ORFs with no homologues in other baculoviruses

Three ORFs (ORF113, -133 and -166) were identified as having no homologues in other baculoviruses (Table 1). These three unique ORFs have no recognizable promoter. Protein homology analysis using HHpred showed that GP133 (aa 50–359) is a likely homologue of mannan-binding lectin serine peptidase 1 (probability, 99.97%; E value, 1.1e-28). Mannan-binding lectin serine peptidase 1 plays a central role in the initiation of the complement lectin pathway38. This homology indicates that ORF133 might be related to the complement lectin pathway, which deserves further research. ORF113 encodes an 8.5-kDa protein with one transmembrane domain (aa 5–27, analysed by TMHMM server v2.0) at the N terminus and with no similarity to any protein in the nonredundant protein database. ORF166 encodes a 7.7-kDa protein with no similarity to any protein in the nonredundant protein database.

The large gene in MyunGV-A

In most cases, helicase is the largest gene in baculovirus genomes; however, in the MyunGV-A genome, ORF45, encoding 1213 amino acids (longer than helicase-1, at 1158 amino acids), is the largest gene. Similar situations are present in the HearGV (ORF44, 1279 aa), TnGV (ORF39, 1213 aa) and MyunGV-B (ORF45, 1507 aa) genomes, whereas the corresponding region is divided into two genes, ORF47 and ORF48, in XcGV6. Compared with XcGV, the MyunGV-A genome has an additional adenosine (A) at position 40315, resulting in a reading frame shift. Protein homology analysis using HHpred and SWISS-MODEL showed no significant similarity to any other known sequences for Myun45.

Enhancins in MyunGV-A

It was first observed in Mythimna (formerly Pseudaletia) unipuncta that GV can increase the rate of infection and fatality of NPV and decrease the larval survival time when GV and NPV coinfect larvae10. Subsequent studies found that the factor responsible for synergistic interaction is a GV protein that shows a synergistic effect only when larvae are infected with NPV; it was identified as a synergistic factor (SF)39. The synergistic effect of viral enhancing factor (VEF) was also observed in TnGV40. The location and sequence of the VEF gene of TnGV have been identified41. This enhancing protein (enhancin) can disrupt the midgut peritrophic membrane (PM), thereby resulting in the more efficient passage of virions to host midgut cells12. Enhancin was identified as a metalloprotease via the discovery of a zinc-binding site as well as by inhibition with a metal chelator and reactivation with divalent ions42.

The MyunGV-A genome has three enhancin genes (Myun157, -159 and -170). Three enhancin genes were also found in both MyunGV-B and TnGV, but their amino acid sequence identities to those of MyunGV-A differ widely: MyunGV-B enhancins are only 35% to 55% identical to those of MyunGV-A, whereas TnGV enhancins are up to 99% identical. Four enhancin genes were found in the XcGV and HearGV genomes, of which enhancin-1, -3 and -4 have high homology (amino acid sequence identities all above 74%) to the three enhancin genes of MyunGV-A. The MyunGV-A enhancin gene (enhancin-3) encoding 901 amino acids has been sequenced and characterized12. The canonical sequence HEXXH, the zinc-binding site in most metalloproteases, was found in enhancin-3 but not in the other two enhancins. It is not clear why three enhancins are present in MyunGV-A, and their respective roles in promoting NPV infection remain to be determined.

Enhancins are found mainly in GVs and a few NPVs. In granuloviruses, they are localized within the granulin matrix and are released to increase virus pathogenicity by acting in the midgut. In contrast, LdMNPV enhancins are located within ODV envelopes and help ODVs pass the host defence barrier by acting directly on the peritrophic membrane as the nucleocapsids move through the barrier43. However, subsequent studies have indicated that LdMNPV enhancins may have a function beyond peritrophic membrane degradation, namely assisting virus-host cell fusion44.

Protein analysis of OBs of MyunGV-A

To date, nine baculovirus proteomic studies have been performed with the intent of revealing infectious mechanisms and virus-host interactions, as follows: six alphabaculoviruses (AcMNPV45,46, BmNPV47, HearSNPV48, HearNPV-G449, AgMNPV50 and ChchNPV51); two betabaculoviruses (ClanGV52 and PrGV53); and one deltabaculovirus (CuniNPV54). In this study, we performed an analysis of MyunGV-A OB proteins. Across the 29 samples, 24 proteins were identified from the putative protein database of MyunGV-A (NC_013772.1) (Table 2). Among the 24 proteins, 20 were detected with two or more peptides, and the other four were detected with one matching peptide. In addition, 15 of the 24 identified proteins were detected in more than one sample. Granulin was found in 28 of the 29 samples (Table S2). Similar observations were reported for CuniNPV54, HearSNPV48 and AgMNPV50. Notably, the identified proteins were not distributed according to their molecular mass in SDS-PAGE gels. The reason was postulated to be incomplete denaturation of OBs and the breakdown of protein complexes or protein processing54.
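The per-protein tallies above (how many gel samples contain each protein) amount to a simple count over sample-level identifications. A minimal sketch, assuming the identifications are available as a mapping from sample number to the proteins found in it; the function names and data layout are hypothetical and the values in any example are not taken from Table S2:

```python
from collections import Counter


def samples_per_protein(identifications):
    """Count, for each protein, the number of gel samples in which it was found.

    identifications: {sample_id: iterable of protein names}
    """
    counts = Counter()
    for proteins in identifications.values():
        counts.update(set(proteins))  # count each protein once per sample
    return counts


def proteins_in_multiple_samples(identifications):
    """Return the sorted names of proteins detected in more than one sample."""
    counts = samples_per_protein(identifications)
    return sorted(name for name, n in counts.items() if n > 1)
```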

Table 2 Analysis of proteins identified from MyunGV-A.

Of the 24 identified proteins, eight are encoded by core genes, including ORF11 (ODV-e18), ORF12 (VP49), ORF14 (ODV-e56), ORF48 (PIF-7), ORF51 (ODV-ec43), ORF103 (ODV-e25), ORF115 (VP39) and ORF125 (GP41); among them, VP39 is the major capsid protein, GP41 is a tegument protein only found in ODVs and is present in the nucleocapsid and the viral envelope as a structural protein of ODVs, and four proteins, including ODV-e18, ODV-e56, ODV-e25 and ODV-ec43, are ODV envelope proteins (Table 2). An ODV envelope protein, ODV-e66, was also identified.

Of the 24 identified proteins, six are encoded by additional genes conserved in GVs, including ORF16, ORF17, ORF18, ORF120, ORF174 and ORF17555. Among them, proteins encoded by two contiguous ORFs (ORF16 and 17) belong to the CpGV ORF16 L family56, and the protein encoded by ORF18 is similar to P10, containing a baculovirus polyhedron envelope protein (PEP) C domain (pfam04513). In addition to structural proteins or those implicated in DNA replication and transcription, four important auxiliary proteins were identified, including SOD, cathepsin and two enhancins. Enhancin-1 and enhancin-3 were detected in our proteomic studies; enhancin-3 was present in 16 samples, while enhancin-1 was present in only 1 sample. Most baculovirus enhancins, including those of MyunGV-A, are located in the OB matrix, whereas LdMNPV enhancins were found to be associated with ODV envelopes43,57. In this study, we did not attempt to determine the specific location of enhancins.

Moreover, four proteins (Myun29, Myun32, Myun44 and Myun67) with unknown functions were detected (Table 2). An increasing number of baculovirus proteomic studies can provide valuable insight into baculovirus structure, infectious mechanisms and interactions with their hosts.

Phylogenetic analysis of MyunGV-A

A phylogenetic tree based on the combined amino acid sequences of 38 core genes from 82 complete baculovirus genomes (Table S1) classified MyunGV-A into clade “a” of Betabaculovirus, which groups viruses infecting larvae of the lepidopteran family Noctuidae. Within this clade, MyunGV-A falls into a subcluster together with TnGV, its closest neighbour, with which it shares a common hypothetical ancestor. XcGV and HearGV form another subcluster next to the MyunGV-A and TnGV subcluster. However, MyunGV-B, another granulovirus from the same host, groups into a subcluster with SpfrGV, slightly away from MyunGV-A and separated from it by MolaGV (Fig. 4). This is consistent with the above comparison of gene organization, in which MyunGV-A is similar to TnGV, XcGV and HearGV in genome size, ORF number and gene order.

Figure 4

Phylogenetic tree of 82 baculoviruses with complete sequences. The phylogenetic tree was generated using MEGA X58 software and performed with the maximum likelihood method and JTT matrix-based model59. The result was visualized using iToL60.


The purified OBs of MyunGV-A show typical GV morphological characteristics under EM. The complete MyunGV-A genome (NC_013772.1) is 176,677 bp, with a G+C content of 39.79%, making it the second largest baculovirus genome to date. It contains 183 ORFs with a minimal size of 50 codons. The genome of MyunGV-A exhibits extensive sequence similarity and collinearity with those of TnGV, XcGV and HearGV. Three unique genes, 12 bro, 2 helicase and 3 enhancin genes, were identified. In particular, two repeated genes (ORF39 and ORF49) are present in the genome in reverse-complementary orientations. Twenty-four OB proteins were identified from the putative protein database of MyunGV-A. According to our phylogenetic tree, MyunGV-A belongs to the genus Betabaculovirus and is most closely related to TnGV.