Introduction

Honey bees (Apis mellifera) are important pollinators with high economic value and ecosystem importance,1,2,3 and are exposed to confined environments, and several factors threaten their health, including various pathogens, parasites, and chemicals used as pesticides in agriculture4,5,6. The global decline of this important pollinator poses a threat to food security and biodiversity conservation7. The composition of the honey bee’s normal or altered microbiota, for which the available knowledge is limited, may also affect their body function. Although there are studies on honey bee gut microbiota8,9,10, there is little evidence on the environmental factors that influence it. Some results show that seasonal and environmental factors can influence the composition of the gut bacteriome in honeybees.11,12,13,14,15. The honey bee gut bacteriome has long been known to be composed of a few core bacterial species11,16,17. However, beyond the bacteriome, the viral composition of the microbiome is increasingly gaining more attention in bees18, as in humans19,20. Most of these studies understandably focus on bacteriophages as they play an essential role in shaping the composition of the bacteriome18,19,20,21. However, the viruses of the honey bee itself might be just as important. Especially considering those found in the gut and feces, which can contribute to their spread22,23,24. Many of these viruses are important pathogens22,23, while others, such as the Apis mellifera filamentous virus (AmFV), are little to not pathogenic to bees25,26. AmFV is the most significant DNA virus of the honey bee25,26 and for a long time was the only known one27. This property of the virus is important because it allows us to investigate the relationship between the bacteriome and AmFV in metagenomic studies targeting bacteria. Despite the fact that the role of AmFV in various diseases is still uncertain, a better understanding of its ecology can bring us closer to understanding its role.

In 2019, a nationwide, climatically balanced survey was conducted in Hungary to investigate changes in the gut microbiome of honey bees based on next-generation sequencing (NGS). The bacteriome results of the study were published by Papp et al. (2022)15. In the course of the analyses, we found a large number of short reads originating from AmFV. In this paper, we present the results of a detailed analysis of the AmFV sequences found in the survey.

Methods

Sample collection and preparation

Details of the design and conduct of the sampling can be found in the materials and methods section of Papp et al. (2022)15, here, we summarise only the methodological details necessary for interpreting the results. In 2019, a country-wide sample collection was performed twice during the honey-producing season, at its onset (March) and the peak (May). A total of 20 sampled apiaries (Fig 1) were selected to obtain a representative sample according to their climatic environment. The climatic environment was characterized by yearly growing degree days (GDD) and the yearly total precipitation. We defined the two categories for our environmental variables as cooler-warmer and less-more for GDD and precipitation, respectively. From individuals collected from three families per apiary, the gastrointestinal tracts of 10 healthy workers per family were removed for sequencing. Paired-end reads were generated from pooled samples per apiary using an Illumina NextSeq sequencer.

Figure 1
figure 1

Sampling points in Hungary. The green dots indicate migrating apiaries in May. The blue dots indicate migratory apiaries at the time of sampling in March, as well as non-migratory apiaries. The inset map shows the study region, Hungary in Europe, colored yellow. Neighboring countries are presented by ISO 3 character codes: Austria (AUT), Croatia (HRV), Romania (ROU), Serbia (SRB), Slovakia (SVK), Slovenia (SVN), Ukraine (UKR).

Bioinformatic and statistical analysis

Quality-based filtering and trimming were performed by Adapterremoval28 using 20 as the quality threshold and only retaining reads longer than 50 bp. The remaining reads were taxonomically classified using Kraken2 (k=35)29 with the NCBI non-redundant nucleotide database30. The taxon classification data was managed using functions of package phyloseq31 and microbiome32. The abundance differences were analyzed by the DESeq2 package33. Analyzing the seasonal effect, a mixed-effect model was applied to handle the repeated measures by apiary as a random factor. The SparCC correlation coefficient quantified the relationship between the relative abundances of core microbiome species and AmFV.34,35 The core microbiome was defined with a relative abundance on species-level above 0.5% in at least one of the samples. The statistical tests were two-sided, and p-values less than 0.05 were considered significant. The cleaned reads were aligned to the AmFV genome (KR819915.2) by Bowtie236 with the very-sensitive-local setting. De novo assembly was carried out using MEGAHIT (v1.2.9)37, polishing of the contigs was performed with POLCA (v4.1.0)38, and scaffolds were created by RagTag (v2.1.0)39 using the AmFV genome (KR819915.2)25. The average nucleotide identity (ANI) of scaffolds compared to the genome KR819915.2 was estimated by pyani (v0.2.12).40 For the genome annotation Prokka (v1.14.6)41 was used guided by the genome KR819915.2. Predicted protein homology analysis was performed using the NCBI BLASTP (v2.14.0)42 algorithm with a minimum e-value of 1.0e-5 on two reference genomes (KR819915.2, OK392616.1). Phylogenetic analysis was performed based on the amino acid sequences of the DNA polymerase gene. The gene-tree was constructed43 based on multiple sequence alignment by MAFFT (v7.490)44. The best substitution model was selected by functions of phangorn package45 based on the Bayesian information criterion. The generated neighbor-joining tree was optimized by the maximum likelihood method. Bootstrap values were produced by 100 iterations. All data processing and plotting were done in the R-environment46.

Results

The shotgun sequencing generated paired-end read counts of samples are ranging between 311,931 and 546,924, with a mean of 413,629. The OTU table, created by Kraken2 taxonomic classification, contained counts of samples ranging between 175,576 and 314,586 with a median of 262,292. The minimum, maximum, and median read counts of the samples assigned as viral-originated were 443, 72,010, and 1,074, respectively.

The viral hits were dominated by reads matching the genome of AmFV. All of the samples contained reads from this species, and their relative abundance per sample is summarised in Fig 2. AmFV-derived reads were more abundant in the March samples than in the May samples (fold change (FC): 5.53, 95%CI: 2.38-12.84, p<0.001). In March, a higher abundance of AmFV-originated reads was identified in samples collected from warmer areas compared to those collected from cooler areas (FC: 26.05, 95%CI: 7.31-92.81, p<0.001). A lower proportion of AmFV-derived reads were identified in samples collected in March from the wetter areas than in those collected from the drier areas (FC: 0.33, 95%CI: 0.13-0.8, p=0.014). The level of AmFV-reads found in samples collected in May showed no significant differences between groups based on either environmental temperature or precipitation (FC: 1.44, 95%CI: 0.61-3.39, p=0.40; FC: 1.85, 95%CI: 0.78-4.37, p=0.16).

From the core microbiome species Bartonella apihabitans (r=-0.384, p<0.001), Bartonella choladocola (r=-0.341, p<0.001), Frischella perrara (r=0.549, p<0.001), Gilliamella apicola (r=0.513, p<0.001), Gilliamella sp. ESL0443 (r=0.502, p=0.004), Lactobacillus apis (r=0.532, p<0.001), Lactobacillus kullabergensis (r=0.515, p=0.002), Lactobacillus sp. IBH004 (r=0.399, p<0.001), Snodgrassella alvi (r=0.485, p=0.028) showed a significant correlation with AmFV.

Figure 2
figure 2

Relative abundance of Apis mellifera filamentous virus originated reads for the first (March) and second (May) sampling. The environmental condition, growing degree-day (GDD), and precipitation categories of sampling sites are also marked.

De novo assembly of four samples (apiary ID 10, 16, 18 in March and 13 in May) resulted in almost the complete AmFV genome. For these samples, Table 1 shows the coverage and depth of the reads over the reference genome (KR819915.2) and statistics on the scaffolds and the ORFs identified within them.

Table 1 Alignment and assembly statistics. Columns 2 and 3 describe the coverage of reads on the reference genome and average depth. The lengths of the scaffolds created by the de novo assembly and the proportion of gaps within them are shown in columns 5 and 6. Column 7 shows the average nucleotide identity (ANI) of scaffolds estimated for the reference genome (KR819915.2). The last two columns show how many ORFs were predicted in the scaffolds generated and how many of these ORFs were predicted to have a protein product that matched a CDS product of the reference genome.

Table 2 summarises the predicted proteins in the scaffolds generated from our samples that can be linked to products with predicted functions in the KR819915.2 and/or OK392616.1 genomes. Figure 3 shows the gene-tree based on the amino acid sequences of the DNA polymerase gene with the best substitution model JTT+I.

Table 2 ORF homologs correspond to genes with predicted functions in the KR819915.2 and/or OK392616.1 genomes. The last four columns show the coverage and sequence identity of the product of the ORFs predicted in our scaffolds that match the product of the reference genome (KR819915.2) with the highest similarity.
Figure 3
figure 3

Gene-tree based on DNA-polymerase amino acid sequences. The genes WOK43731.1, WOK43197.1 and WLJ60241.1 were assembled from the samples of apiary 10 and 16 in March and apiary 13 in May, respectively. Most of the sequences derived from Apis mellifera, with some from other hosts: Apis andreniformis (QCC30343.1), Apis cerana (UZV41744.1), Apis cerana cerana (QAU54355.1), bumblebee (QAU54357.1), Galleria mellonella (QDD55980.1).

Discussion

Even though samples presented in this study were taken from healthy bee specimens, AmFV was detected in all samples. This is in line with our current knowledge of the virus, according to which it is only pathogenic in acute cases and/or if the bee colony is under stress26,47,48. The virus is otherwise commonly prevalent, or even endemic in bee colonies, and besides its very likely oral-fecal transmission route, can possibly be spread transovarially from the queen to the workers25.

In contrast to other studies, we have observed a decline in AmFV abundances as the honey-producing season advanced. Bailey et al. observed an increasing prevalence of infected colonies from the beginning of autumn until the end of spring, which was followed by a steep decline49. Similarly, Hartmann and colleagues observed a significant difference in AmFV loads between autumn and springtime26. It is possible that regional climatic differences might influence the dynamics of viral loads. Further, large-scale surveys are necessary, however, to confirm such patterns.

Furthermore, the relationship of AmFV with the other components of the microbiome can provide us useful information. We found a significant positive correlation with AmFV for several bacteria. In some of these, the abundance of the bacterial species also decreased as the season progressed, as we have observed previously for L. apis and L. kullabergensis15. For other species, such as F. perrara, G. apicola and S. alvi, we have not observed a seasonal pattern in our previous analysis15. Among these, F. perrara is especially of interest, which is known to stimulate the immune system of the honey bees strongly50. This species, as a potential opportunistic pathogen51, may act as a stressor on the bees, thus explaining the observed positive correlation with AmFV. Significant negative correlation was only found for two newly described Bartonella species, B. apihabitans and B. choladocola52. Regarding the only previously known Bartonella species in bees, B. apis, our previous analysis of the same samples found an increase in abundance of this bacterial species with the progression of the season15. This might suggest that even these two newly described species may have shown similar changes, which may explain the negative correlation.

The reason for the seasonal peaks of AmFV or the association between the virus and other members of the honey bee microbiome is yet unknown. Furthermore, the effect of annually higher AmFV abundances on the immune system of bees needs to be determined. Moreover, other studies found correlations in the number of AmFV and other RNA viruses, such as Deformed Wing Virus (DWV) and Black Queen Cell Virus (BQCV)26,53, indicating the benefit that could be achieved if the microbiome could be analyzed as a whole.

Our results confirm the high prevalence of the virus, as previous studies suggested26,49. There were, however, 3 March samples (sample numbers 10, 16, and 18) in which we found an exceptionally high abundance of AmFV. This may be somewhat related to our observations on seasonality, as the virus was not present in such high numbers in the corresponding May samples. Accordingly, the abundance of the virus decreased as the season progressed. In contrast, however, sample 13 was dominated by AmFV in May and, therefore, may be of further interest. The spike in virus abundance may indicate the presence of some stressor on the colony, which has not yet caused observable symptoms in the animals. Bee viruses have been described, for example, to increase in abundance in response to certain pesticides in a concentration-dependent manner54,55, which is thought to be due to the immunomodulatory effect of pesticides54 If indeed the emergence of stressors could be associated with such a fluctuation in viral abundance, it could also suggest the use of the virus as a health indicator and contribute to the diagnosis of various complex disease processes.

Besides their direct effect on honey bees, the spread of various microorganisms between A. mellifera or other bee species of economic importance and wild bees can be of particular economic and ecological significance56. The spillover of pathogens between different arthropod species has been observed for several microorganisms related to A. mellifera57, among which AmFV is no exception. It has been detected in several bee species and also in wasps, flies and beetles56,57,58,59. However, it is important to note that according to our current knowledge, AmFV does not show significant pathogenic effect on A. mellifera nor at the colony nor the individual level26, which does not necessarily mean that solitary wild bee species do not show susceptibility to the virus as well. Indeed, it is possible that these species may show increased sensitivity to certain adverse effects due to their solitary lifestyle, as it has already been shown for the effect of pesticides60. Furthermore, the positive association of the virus with BQCV26, which has also been detected in wild bee species57, may contribute to the potential harm caused by AmFV. Therefore it might be assumed that the ecological and economic importance of AmFV can be significantly determined by the impact the virus may have on wild bee species and their importance in pollination services61.

Based on the phylogenetic analysis, the sample from apiary 16, collected in March from the southwest of Hungary, had the highest similarity to the Swiss reference genome. Both other samples (apiary ID 10 March and 13 May) that contained the DNA-polymerase gene in full length were derived from Eastern Hungary. Even though sample 13 from May is relatively closer to the west of the country due to colony migration, it is originally from Eastern Hungary, and the permanent beekeeping premises are 76 air kilometers away from March sampling point 10. Accordingly, it can be supposed that the viral strain from sample 13 collected in May has already been present at the overwintering location and migrated to the May sampling point. It can be assumed that the mediating effect of the imports of honey and propolis from Hungary to Switzerland is responsible for the close genetic relationship of the Hungarian strains of AmFV.

More than one hundred ORFs were identified in each of four of our scaffolds that did not show any similarity to the reference genome with the homology assumptions. In a previous study, Yang and colleagues (2022)62 predicted the functionality of newly detected ORFs from AmFV isolates. As our sequences were of metagenomic origins, the predicted products may derive from organisms other than AmFV despite our sequences’ high similarity to the reference sequence. Accordingly, we have presented only those products that appeared in any of the two whole genomes.

In the present study, we report 4 new assemblies of the AmFV from Hungary. Our results may provide deeper insights into the genome organization of this virus. Furthermore, our results suggest a seasonal trend of the virus abundance, i.e., there is a decrease in it in the gastrointestinal tract of bees as the production season progresses. The high abundance of the virus in some bee colonies and its association with certain members of the microbiome, in particular F. perrara, may suggest a link between the virus and various stressors. It may, therefore, be important either as an agent of complex disease processes or as an indicator of the health status of the hive.