Bat coronavirus phylogeography in the Western Indian Ocean

Bats provide key ecosystem services such as crop pest regulation, pollination, seed dispersal, and soil fertilization. Bats are also major hosts for biological agents responsible for zoonoses, such as coronaviruses (CoVs). The islands of the Western Indian Ocean are identified as a major biodiversity hotspot, with more than 50 bat species. In this study, we tested 1,013 bats belonging to 36 species from Mozambique, Madagascar, Mauritius, Mayotte, Reunion Island and Seychelles, based on molecular screening and partial sequencing of the RNA-dependent RNA polymerase gene. In total, 88 bats (8.7%) tested positive for coronaviruses, with higher prevalence in Mozambican bats (20.5% ± 4.9%) as compared to those sampled on islands (4.5% ± 1.5%). Phylogenetic analyses revealed a large diversity of α- and β-CoVs and a strong signal of co-evolution between CoVs and their bat host species, with limited evidence for host-switching, except for bat species sharing day roost sites. These results highlight that strong variation between islands does exist and is associated with the composition of the bat species community on each island. Future studies should investigate whether CoVs detected in these bats have a potential for spillover in other hosts.

have also been reported in fruit bats (Pteropodidae) in Madagascar, where β-coronaviruses belonging to the D-subgroup were identified in Eidolon dupreanum and Pteropus rufus 30 .
In this study, we investigated the presence of CoVs in over 1,000 individual bats belonging to 36 species, sampled on five islands (Madagascar, Mauritius, Mayotte, Reunion Island, and Mahé) and one continental area (Mozambique). Based on molecular screening and partial sequencing of the RNA-dependent RNA polymerase gene, we (i) estimated CoV prevalence in the regional bat populations, (ii) assessed CoV genetic diversity, and (iii) identified associations between bat families and CoVs, as well as potential evolutionary drivers leading to these associations.

Results
prevalence of coV. A total of 1,013 bats were tested from Mozambique, Mayotte, Reunion Island, Seychelles, Mauritius and Madagascar (Fig. 1). In total, 88 of the 1,013 bat samples tested positive for CoV by Real-Time PCR (mean detection rate: 8.7%). The prevalence of positive bats was different according to the sampling locations (χ² = 77.0, df = 5; p < 0.001), with a higher prevalence in Mozambique (±95% confidence interval: 20.5% ± 4.9%) than on all Western Indian Ocean islands (4.5% ± 1.5%) (Fig. 2). A significant difference in the prevalence of positive bats was also detected between families (χ² = 44.8, df = 8; p < 0.001; Supplementary Figure S1). The highest prevalence was observed in the families Nycteridae (28.6% ± 23.6%) and Rhinolophidae (26.2% ± 11.0%). Bat species had a significant effect on the probability of CoVs detection (χ² = 147.9, df = 39; p < 0.001; Supplementary Figure S2). Finally, the prevalence of CoV positive bats in Mozambique was significantly different RdRp sequence diversity. Of the 88 positive samples, we obtained 77 partial RdRp sequences using the Real-Time PCR detection system (179 bp) and 51 longer partial RdRp sequences using a second PCR system (440 bp). Sequences generated with the second system were subsequently used for phylogenetic analyses. Details of the sequenced CoV-positive samples are provided in Supplementary Table S1. Pairwise comparison of these 51 sequences revealed 28 unique sequences, and sequences similarities ranging from 60.2% to 99.8%. The lowest sequence similarity was found in Mozambique (60.2% to 99.8%), then in Madagascar (64.0% to 99.8%). No genetic variation was observed for samples from Mayotte and Reunion Island. phylogenetic structure of coVs. Sequence comparison indicated that Western Indian Ocean bats harbor a high diversity of both α and β-CoVs, with conserved groups clustering mostly by bat family (Fig. 3). Specifically, 25 sequences were identified as α-CoVs, and three sequences were genetically related to the β-CoVs. For α-CoVs, all sequences detected in our tested Molossidae formed a highly supported monophyletic group, including CoV sequences from Molossidae bats previously detected in continental Africa (Fig. 4) Table S2). All CoVs found in Miniopteridae clustered in a monophyletic group, including Miniopteridae CoVs sequences from Africa, Asia, and Oceania (Supplementary Table S2). The great majority of α-CoVs detected in Rhinolophidae bats clustered in two monophyletic groups (Fig. 3); one with African Rhinolophidae CoVs and one with Asian Rhinolophidae CoVs. We also detected one CoV from Rhinolophus rhodesiae, which was 100% similar to a Miniopteridae CoV from this study. Rhinonycteridae CoVs formed a single monophyletic group with NL63 Human CoVs. The Rhinonycteridae CoVs detected clustered with NL63-related bat sequences found in Triaenops afer in Kenya (Fig. 5) and showed 85% similarity to NL63 Human CoVs (Supplementary Table S2). Hipposideridae α-CoVs mainly clustered into a single monophyletic group, including 229E Human CoV-related bat sequence found in Hipposideros vittatus from Kenya ( Fig. 6; Supplementary Table S2).
For β-CoVs, two sequences obtained from Nycteris thebaica clustered in the C-subgroup together with other CoVs previously reported in African Nycteris sp. bats (Fig. 7). The sequences showed 88% nucleotide identity to a β-C CoV found in Nycteris gambiensis in Ghana (Supplementary Table S2). Rousettus madagascariensis CoVs clustered with Pteropodidae CoVs belonging to the D-subgroup of β-CoVs (Fig. 8). BLAST queries against the NCBI database showed 98% nucleotide identity between CoV sequences from Rousettus madagascariensis and a β-D CoV sequence detected in Eidolon helvum from Kenya (Supplementary Table S2). co-phylogeny between bats and coVs. Co-phylogeny tests were conducted using 11 Cyt b sequences obtained from the 11 CoVs positive bat species and 27 partial CoV RdRp sequences (440 bp). Results supported co-evolution between the Western Indian Ocean bats and their CoVs (ParaFitGlobal = 0.04; p = 0.001) and a high level of phylogenetic congruence (Fig. 9).

Discussion
We provide evidence for a high diversity of CoVs in bats on Western Indian Ocean islands. The overall prevalence of CoV positive bats was consistent with studies from continental Africa 25 and from islands in the Australasian region 31 , although we detected significant variation in the prevalence of infected bats, according to their family, species, sampling location and season. Our study is nevertheless affected by the strong heterogeneity of bat communities in the island of the Western Indian Ocean, in particular in term of species richness. The high CoV genetic diversity detected in bats from Mozambique and Madagascar is likely to be associated with the higher bat species diversity in the African mainland and in Madagascar, has compared to small oceanic islands 20 . In addition, CoV prevalence in bat populations may significantly vary across seasons, as found in Mozambique with higher prevalence during the wet season than in the dry season. Several studies on bat CoV have indeed shown significant variations in the temporal infection dynamic of CoV in bats, potentially associated with bat parturition [32][33][34] .
Host specificity is well known for some bat CoVs subgenera [35][36][37] . For example, β-C CoVs are largely associated with Vespertilionidae, whereas β-D CoVs are found mostly in Pteropodidae 36,38 . In our study, we showed that Western Indian Ocean bats harbor phylogenetically structured CoVs, of both α-CoV and β-CoV subclades, clustering mostly by bat family. In the new CoV taxonomy based on full genomes proposed by the International Committee of Taxonomy of Viruses (ICTV), α-CoVs and β-CoVs are split in subgenera mostly based on host families 39 , reflected in the subgenera names (e.g. Rhinacovirus for a Rhinolophidae α-CoV cluster, Minuacovirus for a Miniopteridae α-CoV cluster, Hibecovirus for an Hipposideridae β-CoV cluster). Although our classification was based on a partial sequence of the RdRp region, we identified sequences from samples belonging to four of these subgenera (Minuacovirus, Duvinacovirus, Rhinacovirus, and Nobecovirus) and three that could not be classified according to this taxonomic scheme hence representing unclassified subgenera (we propose "Molacovirus", "Nycbecovirus", and "Rhinacovirus2").
A strong geographical influence on CoVs diversity, with independent evolution of CoVs on each island, was expected in our study, because of spatial isolation and endemism of the tested bat species. Anthony et al. 38 found that the dominant evolutionary mechanism for African CoVs was host switching. Congruence between host and viral phylogenies however suggests a strong signal for co-evolution between Western Indian Ocean bats and their associated CoVs. Geographical influence seems to occur within bat families, as for Molossidae. Endemism resulting from geographic isolation may thus have played a role in viral diversification within bat families.
www.nature.com/scientificreports www.nature.com/scientificreports/ Although co-evolution could be the dominant mechanism in the Western Indian Ocean, host-switching may take place in certain situations. For example, in Mozambique, we found a potential Miniopteridae α-CoV in a Rhinolophidae bat co-roosting with Miniopteridae in the same cave. These host-switching events could be favored when several bat species roost in syntopy 40     www.nature.com/scientificreports www.nature.com/scientificreports/ Miniopteridae α-CoV was detected in Rhinolophidae bats 31 . These infrequent host-switching events show that spillovers can happen but suggest that viral transmission is not maintained in the receiver host species. The host-virus co-evolution might thus have resulted in strong adaptation of CoVs to each bat host species. In addition, viral factors (mutation rate, recombination propensity, replication ability in the cytoplasm, changes in the ability to bind host cells), environmental factors (climate variation, habitat degradation, decrease of bat preys), and phylogenetic relatedness of host species are also critical for the viral establishment in a novel host [41][42][43][44] . Nevertheless, apparent evidence of host switching as a dominant mechanism of CoV evolution could be an artifact of a lack of data for some potential bat hosts, leading to incomplete phylogenetic reconstructions 38 .
Several bat CoVs we identified in Rhinonycteridae and Hipposideridae from Mozambique had between 85% and 93% nucleotide sequence similarity with NL63 Human CoVs and 229E Human CoVs, respectively. These two human viruses are widely distributed in the world and associated with mild to moderate respiratory infection in humans 45 . Tao et al. established that the NL63 Human CoVs and 229E Human CoVs have a zoonotic recombinant origin from their most recent common ancestor, estimated to be about 1,000 years ago 46 . During the past decade, they were both detected in bats in Kenya, and in Ghana, Gabon, Kenya, and Zimbabwe, respectively 24,28,47,48 . Intermediate hosts are important in the spillover of CoVs, despite major knowledge gaps on the transmission routes of bat infectious agents to secondary hosts 49 . This hypothesis has been formulated for the 229E Human CoV, with an evolutionary origin in Hipposideridae bats and with camelids as intermediate hosts 48    www.nature.com/scientificreports www.nature.com/scientificreports/ unidentified intermediate host 28,50,51 . Because receptor recognition by viruses is the first essential cellular step to infect host cells, CoVs may have spilt over into humans from bats through an intermediate host possibly due to mutations on spike genes 13,28 . Further investigations of CoVs in Kenyan and Mozambican livestock and hunted animals could potentially provide information on the complete evolutionary and emergence history of these two viruses before their establishment in humans.
MERS-like CoV, with high sequence similarity (>85%) to human and camel strains of MERS-CoV, have been detected in Neoromicia capensis in South Africa and Pipistrellus cf. hesperidus in Uganda, suggesting a possible origin of camel MERS-CoV in vespertilionid bats 25,38,52 . This family has been widely studied, with 30% of all reported bat CoVs sequences from the past 20 years coming from vespertilionids 53 . No members of this family were positive for CoV in our study, which may be associated with the low number of individuals tested; additional material is needed to explore potential MERS-like CoV in the Western Indian Ocean, in particular on Madagascar.
Knowledge on bat CoV ecology and epidemiology has significantly increased during the past decade. Anthony et al. estimated that there might be at least 3,204 bat CoVs worldwide 38 ; however, direct bat-to-human transmission has not been demonstrated so far. As for most emerging zoonoses, CoV spillover and emergence may be associated to human activities and ecosystem changes such as habitat fragmentation, agricultural intensification and bushmeat consumption. The role of bats as epidemiological reservoir of infectious agents needs to be balanced with such human driven modifications on ecosystem functioning, in order to properly assess bat-borne CoV emergence risks.  Figure 6. Detail of the α-CoV clade. 229E-like CoVs generated in the study are indicated in bold. This sub-tree is a zoom on NL63 CoV clade from the tree depicted in Fig. 3. Bootstrap values >0.7 are indicated on the tree. Scale bar indicates mean number of nucleotide substitutions per site.

Statistical analysis.
We have performed Pearson χ² tests on all samples (1,013 bats) to explore the effect of (i) location, (ii) bat family, and (iii) bat species on the detection of coronavirus RNA. Two sampling campaigns, at two different season, in the same location, were available for Mozambique. We thus investigated the effect of the sampling season, between the wet (February) and dry (May) season, on CoV detection in Mozambique in 2015 (264 bats). Analyses were conducted with R v3.5.1 software 61 . phylogenetic analyses. Sequences obtained with the second PCR system 60 were edited with the Chromas Lite Software package version 2.6.4 62 . We explored CoV diversity of the sequences with pairwise identity values obtained from seqidentity function in R bio3d package v2.3-4 63 and identified the most similar CoV RdRp sequences referenced in GenBank using BLASTN 2.2.29 + . An alignment was then generated using the 51 nucleotide sequences obtained in this study and 151 reference CoV sequences representing a large diversity of hosts and geographic origins (Europe, Asia, Oceania, America and Africa), with CLC Sequence viewer 8.0 Software (CLC Bio, Aarhus, Denmark). A phylogenetic tree was obtained by maximum likelihood using MEGA Software v10.0.4 64 , with 1,000 bootstrap iterations, and with the best evolutionary model for our dataset as selected by modelgenerator v0.85 65 .
Host-virus associations were investigated using the phylogeny of Western Indian Ocean bats and their associated CoVs. Bat phylogeny was generated from an alignment of 1,030 bp of mitochondrial Cytochrome b (Cyt b) gene sequences (Supplementary Table S4), for each CoV positive bat species. Finally, bat and pruned CoV phylogenies based on each 393 bp RdRp unique sequence fragment were generated by Neighbor-Joining with 1,000 bootstrap iterations, using CLC Sequence viewer 8.0 Software (CLC Bio, Aarhus, Denmark) 66 . Phylogenetic congruence was tested to assess the significance of the coevolutionary signal between bat host species and CoVs sequences, using ParaFit with 999 permutations in the ape package v5.0 in R 3.5.1 67,68 . Tanglegram representations of the co-phylogeny were visualized using the Jane software v4.01 69 .

Data availability
DNA sequences: Genbank accessions MN183146 to MN183273.