Đakrông virus, a novel mobatvirus (Hantaviridae) harbored by the Stoliczka’s Asian trident bat (Aselliscus stoliczkanus) in Vietnam

The recent discovery of genetically distinct shrew- and mole-borne viruses belonging to the newly defined family Hantaviridae (order Bunyavirales) has spurred an extended search for hantaviruses in RNAlater®-preserved lung tissues from 215 bats (order Chiroptera) representing five families (Hipposideridae, Megadermatidae, Pteropodidae, Rhinolophidae and Vespertilionidae), collected in Vietnam during 2012 to 2014. A newly identified hantavirus, designated Đakrông virus (DKGV), was detected in one of two Stoliczka’s Asian trident bats (Aselliscus stoliczkanus), from Đakrông Nature Reserve in Quảng Trị Province. Using maximum-likelihood and Bayesian methods, phylogenetic trees based on the full-length S, M and L segments showed that DKGV occupied a basal position with other mobatviruses, suggesting that primordial hantaviruses may have been hosted by ancestral bats.


Results
Hantavirus detection. Repeated attempts to detect hantavirus RNA were successful in only one of the 215 lung specimens (Supplemental Table 1), despite using PCR protocols that led to the discovery of other bat-borne hantaviruses. Sequence analysis of the full-length S-, M-and L-genomic segments showed a novel hantavirus, designated Đakrông virus (DKGV), in one of two Stoliczka's Asian trident bats (Fig. 1A), captured in Đakrông Nature Reserve (16.6091N, 106.8778E) in Quảng Trị Province (Fig. 1B), in August 2013. Sequence analysis. The overall genomic organization of DKGV was similar to that of other hantaviruses.
A nucleocapsid (N) protein of 427 amino acids was encoded by the 1,746-nucleotide S segment, beginning at position 58, and with a 405-nucleotide 3′-noncoding region (NCR). The hypothetical NSs open reading frame was not found.
A glycoprotein complex (GPC) of 1,127 amino acids was encoded by the 3,622-nucleotide M segment, starting at position 21, and with a 218-nucleotide 3′-NCR. Two potential N-linked glycosylation sites were found in the Gn at amino acid positions 344 and 396 and one in the Gc at position 924. Another possible site was present at amino acid position 133 of the Gn. For comparison, the amino acid positions of N-linked glycosylation sites of other representative hantaviruses are summarized in Table 1. Also, the highly conserved WAASA amino-acid motif was found at amino acid positions 641-645. The full-length Gn/Gc amino acid sequence similarity was highest between DKGV and LAIV (73.0%) ( Table 2).
Analysis of the 6,535-nucleotide L segment, which encoded a 2,145-amino acid RNA-dependent RNA polymerase (RdRP), showed the highly conserved A, B, C, D and E motifs, The functional constraints on the RdRP were evidenced by the overall high nucleotide and amino acid sequence similarity of 60% or more in the L segment between DKGV and other hantaviruses ( Table 2).
Comparison of the full-length S, M and L segments indicated amino acid sequence similarities ranging from 45.6% (BRNV GPC) to 81.2% (LAIV RdRP) between DKGV and representative bat-borne hantaviruses (Table 2), and showed that DKGV differed at the amino acid level by 30% or more from nearly all rodent-and shrew-borne orthohantaviruses.
Nucleocapsid and glycoprotein secondary structures. Secondary structure analysis of the N and Gn/ Gc proteins indicated similarities and differences between DKGV and representative rodent-, shrew-, mole-and bat-borne hantaviruses (Figs 2 and 3).
The DKGV N protein secondary structure, comprising 53.16% α-helices, 10.07% β-sheets and 35.13% random coils, resembled that of other hantavirus N proteins. A two-domain, primarily α-helical structure joined by a central β-pleated sheet, was observed. Although the N-terminal domain length was nearly the same, structural changes were evident in the central β-pleated sheet and adjoining C-terminal α-helical domain, according to the phylogenetic relationships of the N proteins.
That is, the N protein comprised two α-helical domains and a central β-pleated sheet (Fig. 2) irrespective of the low amino acid sequence similarity among the orthohantaviruses and thottimviruses. However, the central β-pleated sheet motif and RNA-binding region (amino acid positions 175 to 217) of DKGV differed from that of other hantaviruses, which resembled that of murid rodent-borne orthohantaviruses (Fig. 2).
The DKGV glycoprotein secondary structure, comprising 3.82% α-helices, 40.20% β-sheets and 55.99% random coils, resembled that of other hantavirus glycoproteins (Fig. 3). Also, the four transmembrane helices of the DKGV glycoprotein resembled that of other hantaviruses (Fig. 4), and the putative fusion loop (WGCNPVD) and zinc finger domain (CVVCTRECSCTEELKAHNEHCIQGSCPY CMRDLHPSQHVLTEHYKTC) were observed at residues 760-766 and 542-588, respectively. www.nature.com/scientificreports www.nature.com/scientificreports/ Hantavirus phylogeny. DKGV was distinct from other hantaviruses in phylogenetic trees, based on S-, M-and L-segment sequences using the Markov chain Monte Carlo (MCMC) Bayesian methods (Fig. 5). In all analyses, DKGV and LAIV shared a common ancestry. The basal position of mobatviruses and loanviruses in phylogenetic trees suggested that primordial hantaviruses may have been hosted by ancestral bats (Fig. 5).
Host phylogeny. Taxonomic classification of the bats was confirmed by PCR amplification and sequencing of the cytochrome b (Cyt b) and cytochrome c oxidase I (COI) genes. Morphologically indistinguishable Aselliscus stoliczkanus and Aselliscus dongbacana were identified by Cyt b (KU161558-KU161575 and MG524933-MG524935) and COI (LC406430-LC406448) gene sequence analysis. Phylogenetic analysis of bats belonging to the suborders Yinpterochiroptera and Yangochiroptera resembled that of DKGV and other bat-borne hantaviruses (Fig. 6).
Co-phylogeny of hantavirus and host. Segregation of hantaviruses according to the subfamily of their reservoir hosts was demonstrated by co-phylogeny mapping, using consensus trees based on the Gn/Gc glycoprotein and RdRP protein amino acid sequences (Fig. 7). The phylogenetic positions of DKGV and other www.nature.com/scientificreports www.nature.com/scientificreports/ mobatviruses (XSV and MAKV) mirrored the phylogenetic relationships of their Hipposideridae hosts. By contrast, the phylogenetic positions of LAIV from Taphozous melanopogon, QZNV from Rousettus amplexicaudatus and LQUV from Rhinolophus affinis were mismatched between virus and host species tanglegram.

Discussion
The Stoliczka's Asian trident bat, one of three species in the genus Aselliscus, is found throughout Southeast Asia. The Dong Bac's trident bat (Aselliscus dongbacana), a closely related species, overlaps in body size, distribution, echolocation and habitat 32 . However, we failed to detect hantavirus RNA in this latter species. As in previous studies, in which only one or two individual bats were found to be infected 7,12,13,16 , hantavirus RNA was detected in a single Stoliczka's Asian trident bat. Technical issues (such as primer mismatches, suboptimal cycling conditions, limited tissues and degraded RNA), as well as the restricted transmission and/or immune-mediated clearance of hantavirus infection in bats, are possible contributing factors.
By virtue of the N protein and Gc glycoprotein sequence divergence between DKGV and other hantaviruses, as well as the unique host species, DKGV likely represents a new hantavirus species. A three-dimensional, bi-lobed protein architecture for RNA binding was suggested by the DKGV N protein secondary structure and that of other bat-borne hantaviruses. The fusion loop and zinc finger domain of the glycoprotein suggested that DKGV and other bat-borne loanvirus and mobatviruses have similar mechanisms of protein modification with   www.nature.com/scientificreports www.nature.com/scientificreports/ rodent-borne orthohantaviruses 33,34 . On the other hand, the envelope GPC is implicated in virus attachment and cell entry. Further insights for receptor and receptor binding sites of hantaviruses harbored by shrews, moles and bats will contribute to a deeper understanding about their host specificities.
Recently, far greater genetic diversity than those in rodents have been detected in orthohantaviruses harbored by shrews and moles of multiple species, belonging to five subfamilies (Soricinae, Crocidurinae, Myosoricinae, Talpinae and Scalopinae) within the order Eulipotyphla, in Europe, Asia, Africa and/or North America. Similarly, hantavirus RNA has been detected in tissues of several bat species belonging to the families Emballonuridae, Nycteridae and Vespertilionidae (suborder Yangochiroptera) and the families Rhinolophidae, Hipposideridae and Pteropodidae (suborder Yinpterochiroptera). It is unclear to what extent spillover or host switching is responsible for the overall low prevalence of hantavirus infection in only 14 bat species to date, and the inability to detect hantavirus RNA in nearly 100 other bat species analyzed to date 7-16 .  www.nature.com/scientificreports www.nature.com/scientificreports/ Previously, the overall congruence between gene phylogenies of rodent-borne orthohantaviruses and their hosts led to the conjecture that orthohantaviruses had co-evolved with their reservoir hosts 4 . However, recent studies, based on co-phylogenetic reconciliation and estimation of evolutionary rates and divergence times, conclude that local host-specific adaptation and preferential host switching account for the phylogenetic similarities between hantaviruses and their mammalian hosts 35,36 . Although our co-phylogenetic analysis also indicates that bat-borne loanvirus and mobatviruses and their host species have not co-diverged, the availability of whole genome sequences for only four of the 10 bat-borne hantaviruses presented a significant limitation. Thus, future efforts must focus on obtaining full-length genomes of newfound hantaviruses, particularly those harbored by bats in southeast Asia and Africa, to gain additional insights into the phylogeography and evolutionary origins of viruses in the family Hantaviridae.   Table 1). RNA was reverse transcribed to cDNA was synthesized using the PrimeScript II 1st strand cDNA Synthesis Kit (Takara bio, Otsu, Japan) and an oligonucleotide primer (OSM55F, 5′-TAGTAGTAGACTCC-3′) designed from conserved 5′-end of the S, M and L segments of hantaviruses 39 .

RT-PCR and sequencing.
Oligonucleotide primers previously used to detect hantaviruses 12,13,16,[39][40][41][42][43] were employed to amplify S, M and L segments (Supplemental Table 2). First-round PCR was performed in 20-μL reaction mixtures, containing 250 μM dNTP, 2.5 mM MgCl 2 , 1 U of Takara LA Taq polymerase Host Start version (Takara Bio) and 0.25 μM of each primer 16 . Second-round PCR was performed in 20-μL reaction mixtures, containing 200 μM dNTP, 2.5 mM MgCl 2 , 1 U of AmpliTaq 360 Gold polymerase (Life Technologies, Foster City, CA, USA) and 0.25 μM of each primer 16 . Initial denaturation at 95 °C for 1 min for first PCR or at 95 °C for 10 min for second PCR were followed by two cycles each of denaturation at 95 °C for 20 s, two-degree step-down annealing from 46 °C to 38 °C for 40 s, and elongation at 68 °C for 1 min, then 30 cycles of denaturation at 94 °C for 20 s, annealing at 42 °C for 40 s, and elongation at 68 °C for 1 min, in a Veriti thermal cycler (Life Technologies) 7,12,13,16,39 . PCR products, treated with ExoSAP-IT (Affymetrix, Santa Clara, CA, USA) according to the manufacturer's instruction, were sequenced directly using an ABI 3730xl DNA Analyzer (Life Technologies) 16 . www.nature.com/scientificreports www.nature.com/scientificreports/ www.nature.com/scientificreports www.nature.com/scientificreports/ Secondary structure analysis. Full-length amino acid sequences were submitted to NPS@ structure server 44 to predict secondary structures of the N protein and Gn/Gc glycoproteins 36 . Glycosylation and transmembrane sites were predicted at the NetNlyc 1.0 and Predictprotein 45 and TMHMM version 2.0 46 , respectively. The program COILS 47 was used to scan the N protein for expected coiled-coil regions 36 . Phylogenetic analysis. Maximum likelihood and Bayesian methods, implemented in RAxML Blackbox webserver 48 and MrBayes 3.1 49 , under the best-fit GTR + I + Γ model of evolution and jModelTest version 2.1.6 50 , were used to generate phylogenetic trees 13 . Two replicate Bayesian Metropolis-Hastings MCMC runs, each consisting of six chains of 10 million generations sampled every 100 generations with a burn-in of 25,000 (25%), resulted in 150,000 trees overall 13,16 . The S-, M-and L-genomic segments were treated separately in the phylogenetic analyses. Topologies were evaluated by bootstrap analysis of 1,000 iterations, and posterior node probabilities were based on 10 million generations and estimated sample sizes over 100 (implemented in MrBayes) 16 . With a robust phylogeny of shrew-, mole-, bat-and rodent-borne hantaviruses 10 , we readdressed the co-evolutionary relationship between hantaviruses and their hosts that formed the basis of our predictive paradigm for hantavirus discovery, by comparing the degree of concordance between reservoir host and hantavirus cladograms in TreeMap 3β1243 51 .   Table 3. www.nature.com/scientificreports www.nature.com/scientificreports/