Metagenomic sequencing characterizes a wide diversity of viruses in field mosquito samples in Nigeria

Mosquito vectors are a tremendous public health threat. One in six diseases worldwide is vector-borne, transmitted mainly by mosquitoes. In the last couple of years, there have been active Yellow fever virus (YFV) outbreaks in many settings in Nigeria, and nationwide entomological surveillance has been a significant effort geared towards understanding these outbreaks. In this study, we used a metagenomic sequencing approach to characterize viruses present in vector samples collected during various outbreaks of Yellow fever (YF) in Nigeria between 2017 and 2020. Mosquito samples were grouped into pools of 1 to 50 mosquitoes each, based on species, sex and location. Twenty-five pools of Aedes spp and one pool of Anopheles spp collected from nine states were sequenced, and metagenomic analysis was carried out. We identified a wide diversity of viruses belonging to various families in this sample set. The seven viruses detected were Fako virus, Phasi Charoen-like virus, Verdadero virus, Chaq-like virus, Aedes aegypti totivirus, cell fusing agent virus and Tesano Aedes virus. Although there are no reports of these viruses being pathogenic, they are an understudied group that belongs to the same families as, and is closely related to, known pathogenic arboviruses. Our study highlights the power of next-generation sequencing in identifying insect-specific viruses (ISVs) and provides insight into the virome of mosquito vectors in Nigeria.

www.nature.com/scientificreports/ outbreak and its aetiology and inform real-time public health actions, resulting in accurate and timely disease management and control 13 . A greater understanding of the virome in mosquito species in Nigeria could allow for a more accurate assessment of mosquito-borne disease risk, vector competence and mosquito management.
In the course of various YFV outbreaks in Nigeria between 2017 and 2020, we collected vector samples (mostly Aedes spp) at sites where there were active YFV cases. Next-generation sequencing (NGS) was carried out on 26 pools comprising 1300 mosquitoes (50 mosquitoes per pool) from nine (9) states in Nigeria, using a metagenomic protocol as previously described 14. In this paper we present our findings and discuss their implications.
The prevalence of PCLV was significantly higher than that of the other viruses (P < 0.0001). Table 1 shows the distribution of the viruses in the mosquito pools. The distribution of each virus was estimated as the proportion of positive pools over the total number of pools analyzed; it expresses the frequency of a virus across the cohorts in the population. Aedes aegypti and Aedes albopictus were the most common mosquitoes in the study areas, accounting for > 80% of the pools analyzed. No viral genome was assembled from the Aedes simpsoni, Aedes simpsoni complex and Anopheles coustani pools. Excluding these minor groups of mosquito pools, the distributions of the viruses in the two major species were similar. In Aedes aegypti, excluding CFAV from the statistical comparison, the prevalence of PCLV was significantly higher than that of the other viruses (P = 0.006; Table 1). In Aedes albopictus, excluding FKV and TeAV from the statistical comparison, the prevalence of the viruses was similar (P = 0.41; Table 1). FKV and TeAV were excluded because their proportions were zero (e.g. 0 of 10), and no meaningful statistical comparison can be made with a zero value.

were assembled from our sequenced data. We did not construct phylogenetic trees for viruses lacking sufficient genomes on NCBI for proper comparison; in these cases, only one to three genomes are available in the database. The percentage similarity for the viruses assembled in this study is detailed in Tables 2 and 4.
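The pool-level proportion described above (positive pools over total pools) and a comparison between two viruses can be illustrated with a short sketch. The text does not state which statistical test was used, so the two-sided Fisher's exact test below is an assumption chosen only for illustration, and the pool counts in the example are hypothetical rather than values from Table 1.

```python
from math import comb


def pool_positivity(positive_pools, total_pools):
    """Proportion of pools in which a virus was detected."""
    if total_pools <= 0:
        raise ValueError("total_pools must be positive")
    return positive_pools / total_pools


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Rows: virus X, virus Y. Columns: positive pools, negative pools.
    (Assumed test; the paper does not name the one it used.)
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x positive pools in row 1.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    probs = [prob(x) for x in range(lo, hi + 1)]
    p_obs = prob(a)
    # Sum the probabilities of all tables no more likely than the observed one.
    p_value = sum(p for p in probs if p <= p_obs * (1 + 1e-9))
    return min(1.0, p_value)


# Hypothetical counts: virus X in 13 of 26 pools, virus Y in 2 of 26 pools.
print(pool_positivity(13, 26))
print(fisher_exact_two_sided(13, 13, 2, 24))
```

Because each pool holds up to 50 mosquitoes, this proportion describes pool positivity, not the infection rate of individual mosquitoes.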

Discussion
Several studies have applied metagenomic sequencing to identify viruses from mosquitoes in Africa [16][17][18][19]. Our study reports the first metagenomic analysis of mosquitoes in Nigeria. Not only did we detect viral reads in the samples, but the reads were also sufficient to assemble genome sequences. We detected Fako virus, Aedes aegypti totivirus, cell fusing agent virus, Tesano Aedes virus, Phasi Charoen-like phasivirus, Chaq-like virus and Verdadero virus, all of which are being reported for the first time in Nigeria. The Chaq-like virus and Verdadero virus reported in this study are the first reports in Africa. Our study also highlights the application of next-generation sequencing in identifying ISVs and granting insight into the mosquito virome in Nigeria.
Aedes aegypti and Aedes albopictus were the most common mosquitoes in the study areas, accounting for > 80% of the pools analyzed. This is because our sampling was biased toward Aedes species, which are essential vectors of YFV. They are also the dominant members of their genus breeding in domestic and peridomestic areas of human habitation 20.
We assembled four segments (segments 2, 3, 4 and 5) and a partial first segment out of the nine segments of Fako virus (FKV), a reovirus of the genus Dinovernavirus. The virus was first reported in mosquito pools from Cameroon 16. It is maintained via mosquito-to-mosquito transmission and might have evolved from its ancestor through loss-of-function events. There is no report of human infection with FKV. Our genome was obtained from one Aedes aegypti pool from Edo State. Across all segments, our sequences were 96.7% similar to the first sequence from Cameroon (Table 2).
There are seven genomes (four full and three partial) for Tesano Aedes virus (TeAV) in mosquito pools. TeAV is a member of the Iflaviridae family and was first isolated from mosquito samples in Ghana 17. The virus has shown evidence of vertical transmission 17. It increases the growth of Dengue virus 1 (DENV-1) in a   (Table 4).

We had five complete genomes and one partial genome assembly for AaTV. This virus is a member of the Totiviridae family, a group of dsRNA viruses that infect fungi, protozoa, or invertebrates 21. AaTV was highly distributed and present in mosquito pools from Kwara and Ebonyi States. Phylogenetic analysis of the virus genomes revealed that the Nigerian strains clustered together in the same major clade, independent of the American/Asian/European lineages, implying continuous evolution and diversity of the virus (Fig. 2). Sequences obtained from the same state (Ebonyi State) clustered closely in a sub-clade on the tree, which likely reflects localized spread of the virus among the mosquito vectors in this community.
In addition, we had one partial genome assembly of a flavivirus, cell fusing agent virus (CFAV). Many medically important arboviruses belong to the Flaviviridae family. Our sequence shares 97.8% identity with the sequence from Uganda. CFAV was the first ISV to be reported and is named after its characteristic cytopathic effect (CPE), the fusion of cells 22. About two decades later, in 1992, researchers sequenced CFAV 23. The virus has been isolated from Aedes species in dengue-endemic areas 6,24-28. There is a close phylogenetic relationship between insect-specific flaviviruses (ISFs) and medically important flaviviruses. This information could be valuable in understanding how ISFs enable or inhibit the transmission of arboviruses in nature and their possible use as agents of biological control of vectors 2. A study by Baidaliuk et al., 2019 evaluated how CFAV affects ZIKV and DENV-1 in vitro and in vivo 29. Their findings showed a negative correlation both in vitro and in vivo, indicating a decrease in transmission of both viruses in the presence of CFAV.
Furthermore, we assembled 13 genomes for the L and M segments, as well as 12 for the S segment, of Phasi Charoen-like phasivirus (PCLV), a bunyavirus first isolated from wild-caught Aedes aegypti larvae in the Phasi Charoen district of Thailand 30. The phylogenetic trees based on the S segment (nucleocapsid) and M segment (glycoprotein) placed our sequences in a separate cluster, independent of previously detected lineages (Figs. 3 and 4). However, the RdRp sequences encoded by the L segment placed the PCLV from our study in the same clade as the RdRp of PCLV previously detected in Aedes aegypti from Brazil in 2012 (Fig. 5). We have characterized our assemblies for the three segments of the virus (Table 3). There may be a link between PCLV and the transmission of arboviruses. For example, the Ae. albopictus cell line Aa23, persistently infected with CFAV, inhibited ZIKV replication and transmission 31. Another study isolated PCLV from Ae. aegypti naturally infected with the Chikungunya virus (CHIKV) 32 during an arbovirus surveillance program. The relationship between PCLV and CHIKV transmission is still unknown and needs further investigation.
Partitiviruses are known to infect a wide range of hosts, including plants and fungi, with some members recently discovered in arthropods 8,33,34. In this study, three (3) Chaq-like viruses were found in Aedes spp pools. BLASTn searches with the three sequences from our study showed high nucleotide sequence identity (98.7%) to strain CLv.PozaRica20 (MT742176.1), which was isolated from Aedes aegypti in Mexico in 2020, suggesting that the virus may be widely disseminated in the mosquito vector. Only three sequences, with limited information, were available on NCBI for this virus as of 23 October 2021.
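The percent identity values quoted for the assembled genomes can be understood as the share of matching positions in a pairwise alignment. The following is a minimal sketch of that calculation, assuming the two sequences have already been aligned (gaps written as "-"); the function name and the toy sequences are hypothetical, and BLASTn computes its reported identity over its own local alignments rather than with this code.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of alignment columns in which both sequences carry the same base.

    Columns where only one sequence has a gap count as mismatches;
    columns where both have a gap are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" and y == "-":
            continue
        columns += 1
        if x == y:
            matches += 1
    if columns == 0:
        raise ValueError("alignment has no informative columns")
    return 100.0 * matches / columns


# Toy aligned fragments (hypothetical, not sequences from this study).
query = "ACGTACGTAC"
subject = "ACGTACGTAT"
print(percent_identity(query, subject))
```

Identity computed this way depends on the aligned region, so values from a partial genome are only comparable to a reference over the segment that was actually assembled.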
Another partitivirus genome assembled in this study is that of Verdadero virus (Table 4; Fig. 6). Verdadero virus was first isolated from an Aedes aegypti colony from Poza Rica, Mexico. The name of the virus is derived from "verdadero", the Spanish word for "true" 7. There is no report of human infection by this virus. Generally, we did not observe any unique differences in the viruses detected between male and female mosquito pools from the same collection site.
Generally, reports on the effects of ISVs on pathogenic arbovirus transmission are conflicting across studies and between in vivo and in vitro conditions. Although ISVs cannot grow in mammalian cell lines, a study in Brazil isolated a novel insect-specific virus, Guapiaçu virus (GUAPV), from the plasma sample of a febrile person 9. Furthermore, the discovery of ISVs within the families of pathogenic viruses has provided insights into the evolution and adaptation of these groups of viruses 35. For example, ISVs belonging to the Flaviviridae and Bunyaviridae families are thought to be ancient viruses with distinct lineages that have evolved and diversified together with their vector hosts 10,36,37. Studies establishing vertical transmission 24,38 and evidence of the integration of ISV genomic sequences into the genomes of insect vectors 39 have supported this hypothesis. Against this background, many pathogenic arboviruses probably gained their dual-host range through an adaptive evolutionary process that conferred on ISVs the ability to infect vertebrates. There is limited information on the pathogenicity of ISVs to their insect hosts. Furthermore, ISVs have been considered possible biological control agents for vectors and arboviruses of public health importance because of their characteristic lack of replication in mammalian cell lines. Given all these possible applications, ISVs are an exciting group of viruses for further investigation.
We could not detect YFV in this study, possibly because the amount of virus in the vectors was too small to be properly amplified. This could also be due to limitations in the protocol we used to construct the genomic libraries, although we have successfully used the same protocol to sequence and assemble YFV from human samples 13. An approach targeted at enriching Yellow fever virus in the mosquito vectors might have resulted in a YFV assembly.
Identifying these diverse groups of viruses (ISVs) in our study is a first step in applying local genomics capacity within the country for a holistic approach to disease outbreaks. Further investigation is needed into the pathogenic potential of these viruses and how they enhance or inhibit the transmission of circulating arboviruses in Nigeria.

(9) states in Nigeria between 2017 and 2020 for metagenomic sequencing. Generally, three trapping methods were used in this study: egg, larval and adult collections. At each sampling point, the coordinates were taken using a global positioning system (GPS) device as described by 40. Live adult mosquitoes collected in the field were immobilized using ethyl acetate. A total of 15 pools comprised mosquitoes collected at the immature stages and reared to the adult stage, while 11 pools comprised mosquitoes collected directly as adults. These adult mosquitoes were unfed, as they were collected while foraging for a blood meal. All adult mosquitoes collected were morphologically identified to species level using the keys of [41][42][43] and then pooled by species and sex into 50 mosquitoes per pool. They were then placed in well-labelled Eppendorf tubes containing RNAlater. The tubes were stored in freezers for the duration of the surveillance to keep the samples genetically intact. Immature stages (pupae and larvae) were reared to adults before pooling. A cohort of samples making up 26 pools of 1,300 mosquitoes was sequenced. Twenty-five (25) pools were Aedes spp (Aedes aegypti, Aedes albopictus, Aedes luteocephalus, Aedes simpsoni, Aedes simpsoni complex) and one pool was Anopheles coustani (Supplementary Table S1). Mosquitoes were trapped by the National Arbovirus and Vectors Research Centre (NAVRC), Enugu, Enugu State, Nigeria.
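The pooling step described above can be sketched as a simple grouping routine. This is only an illustration under stated assumptions: the record fields (`id`, `species`, `sex`, `location`) and the function name are hypothetical, grouping by location follows the abstract, and the cap of 50 mosquitoes per pool is taken from the text.

```python
from collections import defaultdict


def make_pools(mosquitoes, max_pool_size=50):
    """Group identified mosquitoes by (species, sex, location) and split
    each group into pools of at most max_pool_size individuals."""
    groups = defaultdict(list)
    for m in mosquitoes:
        key = (m["species"], m["sex"], m["location"])
        groups[key].append(m["id"])

    pools = []
    for key in sorted(groups):
        ids = groups[key]
        # Chunk each group so that no pool exceeds the size cap.
        for start in range(0, len(ids), max_pool_size):
            pools.append({
                "species": key[0],
                "sex": key[1],
                "location": key[2],
                "ids": ids[start:start + max_pool_size],
            })
    return pools


# Hypothetical field records, not data from this study.
records = [
    {"id": i, "species": "Aedes aegypti", "sex": "F", "location": "Edo"}
    for i in range(120)
]
records.append(
    {"id": 120, "species": "Anopheles coustani", "sex": "M", "location": "Kwara"}
)
for pool in make_pools(records):
    print(pool["species"], pool["sex"], pool["location"], len(pool["ids"]))
```

Keeping each pool homogeneous in species, sex and site means that any virus assembled from a pool can be attributed to a single host group and collection location.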

RNA extraction and metagenomics sequencing.
A total of 1300 mosquitoes in twenty-six (26) pools were sequenced using the established unbiased protocol 14. Briefly, vector pools were initially homogenized in 1 ml of cooled Dulbecco's Modified Eagle Medium (DMEM) (composition: 500 ml DMEM High Glucose (4.5 g/l) with L-glutamine, 1 ml penicillin-streptomycin, 15 ml fetal calf serum (FCS) 3% and 5 ml amphotericin B) and 500 ml of Zirconia beads (Firma Biospec: 2.0 mm, Cat. No 1107912). The contents were macerated for 10 min on the Qiagen TissueLyser LT, followed by centrifugation at 4500×g for 15 min. The supernatant was then used for RNA extraction with the QIAamp Viral RNA extraction kit (Qiagen, Hilden, Germany) according to the manufacturer's instructions. Extracted RNA was treated with TURBO DNase to remove contaminating DNA, and cDNA synthesis was carried out according to the published protocol 13,14. Sequencing libraries were made using the Illumina Nextera XT kit. Next generation sequencing using Illumina Miseq