Identification of a Novel Enterovirus Species in Rhesus Macaque in China

Recent studies of Enterovirus (EV) in nonhuman primates (NHPs), which could act as a source of future emerging human viral diseases, have boosted interest in the search for novel EVs. Here, a highly divergent strain of EV, tentatively named SEV-gx, was identified by viral metagenomic analysis from stool samples of rhesus macaques in China. In total, 27 of 280 (9.6%) faecal samples from rhesus macaques were positive for SEV-gx. Its complete genomic sequence is 7,367 nucleotide (nt). Genomic analyses showed that it has a standard genomic organisation for EVs, being more closely related to EV-J strains (approximately 54.0%, 43.0–44.1%, 52.3–55.2%, 61.1–62.7% and 64.0% amino acids identity in polyprotein, P1, P2 and P3 and combined 2C/3CD regions, respectively). It was also shown to have genome characteristics typical of EVs. Phylogenetic analysis of P1, 2C and 3CD aa indicated that SEV-gx can be classified as a distinct cluster in the EVs. All of this evidence demonstrates SEV-gx is a novel species (tentatively named EV-K) in the EV genus, which contributes to our understanding of the genetic diversity and evolution of EVs. Further studies are needed to investigate the potential pathogenicity of SEV-gx in NHPs and humans.

Genomic characterisation. Our analysis showed that the assembled complete genome sequence of SEV-gx is 7,367 bp in length, excluding the poly(A) tail. The G + C content is 43.0%, which is within the range of those in the EV genus (from 41.0% to 49.0%). The genome organisation of SEV-gx is similar to those of other EVs. A single ORF of 6,630 nt, encoding a polyprotein of 2,209 aa, was found to be flanked by a 5′ -UTR of 660 nt and a 3′ -UTR of 77 nt. The full polyprotein includes structural protein P1 of 858 aa (VP4, VP2, VP3 and VP1) and nonstructural proteins P2 of 580 aa (2A, 2B and 2C) and P3 of 771 aa (3A, 3B, 3C pro and 3D pol ), which shows about 54.0% aa identity with EV-J. The cleavage sites within the polyprotein were predicted to be mainly Gln (Q)/Gly (G) typically processed by 3C pro or 3CD pro , except for the cleavage sites at the junction of VP1/VP2 and VP1/2A, which were Lys (K)/Ser (S) and Glu (E)/Gly (G), respectively (Fig. 1), similar to those of picornaviruses.The GenBank accession number for the sequence of SEV-gx is KU587555.
Genomic analyses. The 5′ -UTR of SEV-gx shares 38.4-57.1% nucleotide sequence identity with those of strains of other EV species (about 57.1% identity to EV J), but this similarity is much less to those of other picornaviruses ( Table 1). The predicted RNA secondary structure of the complete 5′-UTR contained seven domains. Similar to the type I internal ribosome entry site (IRES) of EVs 3 , Domain I (nt 1-77) formed a cloverleaf structure, while five domains, II to VI, were the main domains of IRES element (Fig. 2), which are critical for the initiation of translation in a cap-independent manner 23 . Between domains I and II, there is an additional stem-loop (domain Is), which is also observed in bovine and dromedary EVs 3 . The putative translation initiation site (AUG) of SEV-gx was predicted at the site of nt 661, which was contained in the optimal Kozak context 24 , A 658 AUAUGG. Upstream of the AUG start codon, there was a Yn-Xm-AUG motif (nt 586-610), similar to EVs with type I IRES whose Yn-Xm-AUG motif 25 is located about 30-150-nt upstream of the AUG initiation codon. Based on these findings, SEV-gx IRES can be classified as a type I IRES. There was also no L protein in the polyprotein of SEV-gx, which is similar to previous findings in EVs. The 3′ -UTR has no significant similarity to those of other picornaviruses, except 40.2-46.0% nt identity to the strains of EV-C ( Table 1) that is within the range of identity between species in EVs (< 62.0%) 14 . The P1 aa sequence of SEV-gx shares 32.2-44.3% nucleotide sequence identity with those of other EV genus strains (the best match being to species EV-J), but this similarity is much less to other picornaviruses (Table 1). A GxxxS/T (G 1 ASVS) myristoylation site was found in the N-terminus of VP0. VP1 of SEV-gx showed the most divergence and included many insertions and deletions (data not shown) in comparison with other picornaviruses, being no more than 32.3% identical at the aa level to those of other EV strains (Table 1). Despite the poor similarity to other strains, the VP1 of SEV-gx also possessed the conserved motifs of PAL(QT)A(AV)ETG and M(FIY)VPPG (P 589 GLNAQETG and M 707 YVPPG motif) found in EV.
The P2 aa sequence of SEV-gx exhibited 52.3-55.2% identity to those of other EV species (Table 1). Although the 2A protein of picornaviruses is a highly variable region (25.3-52.9% aa identity to EVs), a putative catalytic triad of His-Asp-Cys (H 908 D 937 C 969 ) identified in EVs was also found 26 . In addition, the conserved GXCG motif (G 968 DCG) was found and formed part of the active site of the protease, suggesting that the 2A protein may function as a viral protease. Like those of other EVs, there were no Asn-Pro-Gly-Pro (NPGP) motifs in 2A and 2B, which are required for co-translational cleavage in avihepatoviruses and avisiviruses 27 . The cysteine-rich region (CX 2 CX 8 CX 4 C) was identified in the SEV-gx 2C protein, identical to the PV1 2C motif, which is used to bind zinc and plays an important role in RNA replication 28 . Similar to all of the other picornaviruses, the NTP binding motif GXXGXGKS (G 1239 SPGSGKS) and the helicase activity motif DDLXQ (D 1286 DLGQ) were also found in the 2C protein 29 .
The identity of the P3 aa sequence of SEV-gx to those of other EV species was 61.1-62.7%, while it showed less than 42.4% aa identity to other picornaviruses. A putative catalytic triad of His-Glu-Cys (H 1588 -E 1619 -C 1696 ) was   seen in the predicted 3C pro (protease), similar to those in EVs, which differed from those (His-Asp-Cys) in some other genera of picornaviruses. The conserved GXCG (G 1694 QCG) motif was also found in 3C pro of SEV-gx, which formed part of the active site of the viral protease. Similar to those in the EV genus, the putative RNA-binding domain (K 1630 FRDI) 30 and the motifs of RNA-dependent RNA polymerases (K 1906 DELR, G 2032 GMPSG, Y 2074 GDD and F 2121 LKR) 31 were also identified in the 3D pol of SEV-gx.
Phylogenetic analysis. Phylogenetic trees were constructed based on the complete aa sequences of P1, 2C and 3CD of SEV-gx and other representative picornaviruses. In the P1 region, SEV-gx formed a single monophyletic tree related to the cluster of picornaviruses including EV, Rabovirus, Sapelovirus and avian picornaviruses, which was close to the root between the genera EV and Rabovirus, with 100% bootstrap support (Fig. 3a). In the conserved 2C region, SEV-gx formed an independent tree between human rhinovirus species B and C in the EV genus, showing it to be most closely related to human rhinovirus species B (Fig. 3b). However, in the conserved 3CD region, SEV-gx formed a monophyletic tree between human rhinoviruses and EVs, showing it to be relatively closely related to human rhinoviruses, with 89% bootstrap support (Fig. 3c).

Discussion
EVs are one of the most common viruses infecting humans. Infection of humans and animals by EVs is usually asymptomatic, but some EVs cause severe and occasionally fatal diseases in humans and animals 32 . Interestingly, a growing number of studies have reported that some EVs (EV-A76, EV-D111 and EV-A119) co-circulate in both humans and NHPs [33][34][35][36] and suggesting that wild NHPs could act as EV reservoirs and sources of future emerging EVs 37 . As such, it is very important to document the presence of divergent EVs among NHPs. However, EVs infecting Chinese NHPs are still rarely reported. Here, we report a novel EV, identified in faecal samples from Macaca mulatta at LongHu Mountain in Guangxi Province, China, by Miseq high-throughput sequencing. To the best of our knowledge, this is the first study on the identification of EVs from Chinese wild NHPs. Genomic characterisation analysis showed that SEV-gx contains a type I IRES and has no L protein. VP0 protein is assumed to be cleaved into VP4 and VP2 at the cleavage site of Lys/Ser. The VP1 protein possesses the PAL(QT)A(AV)ETG and M(FIY)VPPG motifs. 2A protein has the catalytic triad of His-Asp-Cys and the conserved CXCG motif that makes it function as a protease to cleave VP1/2A. 2C protein has the NTPase motif GXXGXGKS and the helicase motif DDLXQ. 3C pro has the catalytic triad of His-Glu-Cys and conserved CXCG motifs, which enables it function as a chymotrypsin-like protease. 3D pol has the RNA-binding domain (KFRDI) and the conserved motifs of RNA-dependent RNA polymerases. All of these findings demonstrate that the structural features of SEV-gx are similar to those of members of the EV genus in the family Picornaviridae.
The Picornaviridae Study Group (PSG) guidelines state that members of a species of the genus should share < 70.0% aa identity in the polyprotein, < 60.0% aa identity in P1 and < 70.0% aa identity in 2C + 3CD, while members of different genera should share less than 40.0%, 40.0% and 50.0% aa in P1, P2 and P3, respectively. From the sequence alignment of SEV-gx, we know that it is most closely related to those of species EV-J (simian EVs) (43.0-44.1% aa identity in P1, 52.3-55.2% aa identity in P2, 61.1-62.7% aa identity in P3 and less than 64.0% aa identity in 2C/3CD combined), which meets the criteria for a different species in the EV genus. Phylogenetic analysis of P1, 2C and 3CD between SEV-gx and other picornaviruses showed that SEV-gx formed a monophyletic tree in the genus EV. Therefore, SEV-gx should be classified as a member of a distinct species (tentatively named EV-K) in the genus EV.
SEV-gx was detected in approximately 10.0% of Macaca mulatta stool samples, suggesting that it is common in the local rhesus macaque population. The 100.0% nt identity of all of the SEV-gx VP1 sequences showed that the virus is stably circulating locally. Except for SEV-gx, the viral metagenomic analysis in this study did not show any other EV-related viruses in M. mulatta, which contrasts with recent reports about the discovery of widely diverse EVs infecting monkeys in the wild 18,35 . From the phylogenetic analysis of the P1 region, SEV-gx was somewhat distinct from other known EVs and close to the raboviruses 38 and sapeloviruses, which implied that SEV-gx may be a special evolutionary intermediate between EVs and raboviruses/sapeloviruses. Meanwhile, the relatively large distance of SEV-gx from other EVs also suggested that the diversity of EVs in NHPs would be much broader than previously recognised. However, the phylogenetic analysis of the 2C/3CD region was not consistent with that for P1, and showed that SEV-gx was most closely related to human rhinoviruses, which indicated that SEV-gx and human rhinoviruses may share a common SEV-gx-like ancestor. Since the 2C/3CD region is conserved across all of the picornaviruses, it will most likely reflect the true phylogeny between SEV-gx and other picornaviruses. These findings about the special genome of SEV-gx should facilitate future research about the evolution of EVs.
Previous studies have suggested that the transmission of some EVs from wild NHPs to humans may have occurred recently [33][34][35][36] . In fact, NHPs have been indicated as a virus reservoir and have spread some important viral pathogens to humans, including Ebola/Marburg viruses and human immunodeficiency virus 39,40 . Thus, it would be reasonable to expect that wild NHPs could play the role of a reservoir and/or source of future emerging EVs with unpredictable symptomatology in humans. The spread of SEV-gx from local NHPs to adjacent habitats can also not be ruled out. Therefore, it is important to perform further studies, including serological assays, on humans in local areas to predict their risks of infection. Furthermore, more studies on virus species in wild animals are necessary to prevent and control human viral diseases, in view of the major environmental changes currently faced by wild animals and humans.

Materials and Methods
Specimens. A total of 280 faecal samples were randomly collected from Macaca mulattaat LongHu Mountain in Guangxi Province, China, from January to May, 2014. All samples were transported to our lab on dry ice and stored at − 80 °C until further analysis.
Sample extraction and high-throughput sequencing. Total nucleic acids were extracted from the 280 samples that were diluted with phosphate-buffered saline (PBS) (1:10 w/v ratio) and passed through 0.45-μ m and 0.22-μ m filters, using a QIAamp Viral Mini Kit (Qiagen, Hilden, Germany), in accordance with the manufacturer's instructions. Viral nucleic acid libraries were constructed by sequence-independent random RT-PCR amplification. Then, the PCR products were sequenced using an Illumina Miseq 2500 platform (Illumina, San Diego, CA, USA). Initial sequencing data were analysed using the customised informatics pipeline Virus Hunter, as described previously 41 .
Detection of SEV-gx and amplification of the complete VP1 region. The presence of SEV-gx was confirmed in 280 faecal samples from Macaca mulatta by reverse transcription nested PCR (RT-nested PCR) with PrimeScript One Step RT-PCR Kit (Takara, Tokyo, Japan), based on the sequences obtained by Miseq sequencing. Complete VP1 sequences were amplified for the SEV-gx-positive samples, using PrimeScript One Step RT-PCR Kit (Takara). All of the amplifications were achieved under the following conditions: 50 °C for 30 min, and 94 °C for 5 min, followed by 35 cycles (94 °C for 30 s, 53 °C for 30 s and 72 °C for 1 min) and then 72 °C for 7 min. The RT-PCR products were electrophoresed and purified on a 1.5% agarose gel. The sequences were determined using the Big-Dye terminator cycle sequencing kit and the ABI Prism 310 Genetic Analyzer (Applied Biosystems, Foster City, CA, USA). All of the primers used are listed in Table S1. Whole-genome sequencing. To obtain the complete genome sequence of SEV-gx, the genome-walking Kit (Takara) was used to amplify the unknown sequences, and the full terminal sequences were determined by repeated amplification and sequencing using the 5′ and 3′ Rapid Amplification of cDNA Amplification Kit (Clontech, Mountain View, CA, USA), in accordance with the manufacturer's instructions. The specific primers used here were based on the obtained contig and newly amplified sequence. Three long overlapping fragments were amplified to confirm the final genomic sequence using LA-Taq DNA polymerase (Takara). All primers used here are shown in Table S1. Sequence analysis. The sequence of SEV-gx was analysed by sequence alignment with other EVs and representative picornavirus sequences, using Clustalx (ver. 1.83). Pairwise nt and aa identities between SEV-gx and other picornaviruses were calculated using DNAMAN software. Cleavage sites of SEV-gx were predicted based on the alignment of EVs and other picornavirus sequences.
RNA structure prediction of the 5′-UTR. The 5′ -UTR RNA secondary structure of SEV-gx was predicted using consecutive fragments of the complete nt sequence of the 5′ -UTR and a thermodynamic folding energy minimisation algorithm with RNA structure software (ver. 5.3). The graph was integrated using RnaViz software (ver. 2.0.3).

Phylogenetic analysis.
To determine the phylogenetic relationship of SEV-gx, the aa sequences of the P1 and conserved 2C/3CD regions were aligned between SEV-gx and other EVs and picornavirus strains using Clustalx (ver. 1.83), and MEGA 4.0 software was then used to construct phylogenetic relationships by the neighbour-joining method with datasets of 1,000 replicates.