Identification and molecular characterization of the first complete genome sequence of Human Parechovirus type 15

Using a metagenomics approach, we have determined the first full-length genome sequence of a human parechovirus type 15 (HPeV15) strain, isolated from a child with acute flaccid paralysis and co-infected with EV-A71. HPeV15 is a rarely reported type. To date, no full-length genome sequence of HPeV15 is available in the GenBank database, where only limited VP1 sequences of this virus are available. Pairwise comparisons of the complete VP1 nucleotide and deduced amino acid sequences revealed that the study strain belongs to type 15 as it displayed 79.6% nucleotide and 93.4% amino acid identity with the HPeV15 prototype strain. Comparative analysis of available genomic regions and phylogenetic analysis using the P2 and P3 coding regions revealed low nucleotide identity to HPeV reference genomes. Phylogenetic and similarity plot analyses showed that genomic recombination events might have occurred in the UTRs and nonstructural region during HPeV15 evolution. The study strain has high similarity features with different variants of HPeV3 suggesting intertypic recombination. Our data contributes to the scarce data available on HPeVs in Africa and provides valuable information for future studies that aim to understand the evolutionary history, molecular epidemiology or biological and pathogenic properties of HPeV15.


Full-length genomic characterization and analysis.
The nucleotide (nt) sequence of NIG13194 was generated with an average coverage of 78X. De novo assembly and iterative mapping produced a genome scaffold covered by 15272 reads (0.04% of the total assembled reads).
The full-length sequence of NIG13194 consists of 7,298 nt in length excluding the 3′-poly (A) tail. The overall base composition of the study strain genome was 32.5% A, 20.4% G, 19.2% C and 27.9% U. NIG13194 contained a 5′untranslated region (UTR) of 680 nt, a single ORF of 6531 nt encoding a putative polyprotein precursor of 2177 aa, and a 3′UTR of 99 nt preceding the poly(a) tract. The polyprotein of NIG13194 comprised capsid proteins VP0 (289 aa), VP3 (256 aa), and VP1 (226 aa), and nonstructural proteins 2 A (149 aa), 2B (122 aa), 2 C (329 aa), 3 A (117 aa), 3B (20 aa), 3 C (200 aa), and 3D (469 aa). The NIG13194 strain was further characterized by comparing the nt sequence and deduced amino acid (aa) sequences in different regions of the genome with those of HPeV prototype strains available in the GenBank database (Table 1). In order to identify the type of the HPeV detected, the VP1 sequence of strain NIG13194 was compared with those of 19 HPeV reference strains. Based on the standard criteria for classification of parechoviruses (nt and aa sequence identity of VP1 ≥ 77% and ≥87%, respectively 14 ), strain NIG13194 was confirmed to be a HPeV15 as its complete VP1 coding sequence displayed 79.6% nt and 93.4% aa similarity with the HPeV15 prototype strain BAN-11614 (Table 1). Type HPeV15 was also confirmed with the RIVM typing tool (http://www.rivm.nl/mpf/enterovirus/typingtool/). Following alignment of the VP1 coding region of NIG13194 with HPeV15 prototype strain BAN-11614, nine deletions were observed at the 3′end of the VP1 (nt 649-651 and nt 666-671, numbered according to strain BAN-11614). Comparison of the complete ORFs of NIG13194 with other prototypes showed low nt identities (75.5 to 79.3%) and low aa identities (71.2 to 90.6%). The closest related HPeVs were HPeVs -3, -7 and -18 with nt identity of 79.3% and aa identity of 74.2%-90.6% (Table 1). Phylogenetic analysis. The dataset for our phylogenetic analysis of VP1 was conducted with ten HPeV15 VP1 sequences available in GenBank (including three complete and seven partial). All eleven HPeV15 strains clustered together and formed a clade supported by a high (100%) bootstrap value (Fig. 1). HPeV sequences formed monophyletic groups that corresponded to their type designation based on the VP1 region. The highest identity of the study strain was with strain A38 from Ghana (KY931650) (88.2% and 96.9% nt and aa identity, respectively). Overall, strain NIG13194 was more closely related to Ghanaian strains (mean p-distance of 0.118 SE ± 0.011) than to strains from Pakistan and Bangladesh (mean p-distance of 0.216 SE ± 0.018). Similarly, in terms of aa identity, NIG13194 was also more closely related to African strains (mean p-distance of 0.028 SE ± 0.009) than to Asian strains (mean p-distance of 0.053 SE ± 0.014). The average p-distance in the VP1 region of strain NIG13194 from other HPeV15 strains was 0,172 suggesting great genetic divergence among them. Phylogenetic analysis based on the P1, P2 and P3 coding regions of study strain NIG13194 and complete genome HPeVs strains available in GenBank (n = 134) was conducted to investigate their genetic relationships (Fig. 2). Consistent with the VP1 tree, in the phylogenetic analysis based on the P1 capsid coding region, the NIG13194 isolate clustered with strains of types -3, -17 and -18 (Fig. 2a). No sub-cluster was observed with HPeV15 strains because full-length sequences of the P1 region were not available for published HPeV15 strains. However, in the P2 and P3 region-based trees, strain NIG13194 grouped together with strains of other types sharing the highest similarity with HPeV4 strain TW00032 (83.3%) and HPeV2 strain 51Chzj674 (77.6%), respectively (Fig. 2b,c). These topological inconsistencies between the structural and nonstructural regions are suggestive of recombination during the evolution of HPeV15 or the other HPeV types.
www.nature.com/scientificreports www.nature.com/scientificreports/ Recombination analysis. To confirm the existence of recombination events, similarity plot and bootscanning analysis were conducted against sequences closely related to NIG13194. Given that no close relationship was observed between strain NIG13194 and other types based on homologous comparison of different genomic regions with HPeV prototypes (Table 1) or on the phylogenetic analysis in P2 and P3 coding regions (Fig. 2b,c), strains closely related with NIG13194 were screened against the NCBI non-redundant nt database using BLASTn (http://www.ncbi.nlm.nih.gov/). Non-coding regions (5′and 3′UTRs), structural P1 coding region (VP0, VP3 and VP1) and nonstructural P2 (2 A, 2B and 2 C) and P3 (3 A, 3B, 3 C and 3D) regions of NIG13194 were used as queries for BLASTn and sequences with highest homologies and complete genomes were used in the recombination analysis (Supplementary Table 1). A BLASTn search with the complete genome of NIG13194, the complete ORF and the P2 region showed that the highest sequence similarities (81.9%, 81.5% and 87.3%) were shared with an HPeV3 strain (651689) from the Netherlands (Supplementary Table 1). The similarity plot analysis indicate that the genome of strain NIG13194 displays a mosaic-like structure, suggesting that multiple genetic exchanges occurred through recombination events with common ancestors (Fig. 3a). The highest percent support values (>90%) were with different HPeV3 strains at the 5′UTR (HPeV3 strain BJC3174, China 2012), the 3′UTR (HPeV3 strains FEC21 and FEC23, Australia 2013) and the 2 C and 3 A coding regions (HPeV3 strain 651689, Netherlands 2006) which suggests that genetic exchanges might have occurred with type 3 strains. Bootscanning analysis confirmed the existence of recombination events between the genomic sequence of study HPeV9 www.nature.com/scientificreports www.nature.com/scientificreports/ strain NIG13194 and related viruses (Fig. 3b) and recognized two major potential breakpoints around nucleotide position 3050 in the junction between P1 and the P2 regions and another one located at the 3′-terminus of the 3 A region around nucleotide 5249 (positions referred to alignment).

Discussion
The present study reports the identification and complete genomic characterization of HPeV15 recovered from a patient in Niger. HPeV15 is a rarely reported HPeV type worldwide. From 2007 to 2010, a limited number of HPeV15 strains were reported from Bangladesh, Pakistan and Ghana 4,[9][10][11] . To the best of our knowledge, HPeV15 has not been reported from elsewhere to date despite continued HPeV surveillance in developed settings such as USA 15,16 or the Netherlands 17,18 nor in worldwide HPeV investigations [19][20][21][22] . There are two possible reasons for limited worldwide HPeV15 detection: (i) silent transmission due to HPeV underdiagnosis explained by the lack of HPeV routine testing in clinical settings and lack of awareness among clinicians 1 , or (ii) localized geographical circulation of the virus in parts of Africa and Asia. However, in Africa, none of the studies conducted between 2002 and 2016 for HPeV detection in stool samples detected HPeV15 strains, neither in healthy nor in children with gastrointestinal symptoms [23][24][25][26] , except for one study from Ghanaian children in 2007-2008 11 . Nevertheless, these studies highlight the high genetic diversity of HPeVs in Africa, which include rare types like HPeV16, HPeV17 or the new type HPeV18 11,23,25,26 . The studies that detected HPeVs in respiratory samples in African countries (Tunisia, Kenya and Gabon) did not perform type identification and therefore it is not possible to know whether type 15 was detected in these samples [27][28][29][30] . HPeV typing on respiratory samples would provide valuable information to understand the range of clinical presentations associated with different HPeV types and their genetic diversity. Pairwise and homologous comparison of study strain with VP1 sequences of other HPeV15 strains (African and Asian strains) revealed a high level of genetic diversity between strains. When compared with the HPeV15 prototype, a genetic divergence of more than 20% (79.6%) was observed. Considering the 77% identity demarcation criteria for HPeV types, it is reasonable to conclude that HPeV15 is not a new emerging virus, but one that has been circulating and evolving unrecognized for a long time.
In our study, HPeV15 was identified by using a cell culture system designed for EV diagnostics and a metagenomic approach. This finding differs from previous studies on HPeV15 where the virus was detected without isolation and directly from collected stool samples by HPeV real time reverse transcription PCR and where typing was done by amplifying and sequencing VP1 4,9,11 . We describe the efficient growth of HPeV15 in RD cells, a cell line included as part of the routine diagnostic algorithm implemented for Polio diagnosis. This is in agreement with previous studies describing HPeV types 1 and 4 growth in this cell line 5 . Remarkably, HPeV3, one of the most similar types to our study strain, cannot be cultured in RD cells 5 . Because HPeV has an indistinguishable CPE from EVs in RD cells 31,32 , HPeVs might be misdiagnosed when using conventional culture algorithm designed for EV diagnostics. This calls for implementation of HPeV molecular assays for direct detection of HPeVs in clinical specimens obtained from young children or the use of metagenomics. In this study, untargeted NGS has proved a useful tool when applied retrospectively to screen for coinfecting viruses that are not investigated in   www.nature.com/scientificreports www.nature.com/scientificreports/ routine laboratory work. In recent years, NGS has boosted identification of unsuspected and novel viruses and is becoming the standard for the discovery of viral pathogens 33,34 . It has been successfully used to identify or characterize full-genomes of new HPeVs like a HPeV1 variant 35 , type 4 31 , types 5 and 6 32 , type 7 36 , type 16 26 , type 17 37 , types 18 and 19 38 or to detect HPeVs in clinical specimens 25,26,[39][40][41] . Thanks to NGS, we were able to detect a mixed infection of HPeV15 and EV-A71 when using this technique to characterize the full genome sequence of EV-A71 13 . Several studies have reported HPeVs in mixed infections. Interestingly, two studies observed that HPeV15-positive cases had a higher chance of confections 10,11 . More studies have observed HPeV coinfections in patients with gastrointestinal or respiratory infections, although the clinical impact of HPeVs in these mixed infections is not clear 20,22,42 .
In our study, HPeV15 has been found in a patient with AFP. HPeVs are increasingly recognized to be associated with viral neurological pediatric disease 2 . Different case reports describe HPeV types 1, 3, 5, 6 and 12 detection in sporadic cases of paralysis in children <3 years of age 36,40,[43][44][45][46] . However, we cannot speculate that HPeV15 infection may correlate with AFP since only a few strains have been discovered worldwide and none of them was from AFP cases. In this study, AFP could be explained by the coinfecting EV-A71 (genus Enterovirus, family Picornaviridae) which has been associated with severe and sometimes fatal neurological diseases, including AFP, affecting mostly infants and children 47 . However, like HPeVs, detection of EVs in feces is not enough evidence to demonstrate the relationship with AFP since asymptomatic long-term excretion of these picornaviruses in stool is common in young children.
Many picornaviruses contain an arginine-glycine-aspartic acid (RGD) motif in the C-terminus of VP1 capsid protein known to play a role in host cell recognition through interactions with cell surface integrins 3 . A functional RGD motif present in HPeV types 1, 2, 4, 5, and 6, while absent in the other HPeV types including type 15, has been associated with infection of the CNS and neonatal sepsis 3,4,45 . In accordance to this, no such motif was detected in the HPeV15 strain found in this study, supporting the use of an alternative non-integrin receptor for entry into host cells although this remains to be determined 1 . It has been suggested that a different cell tropism conferred through the use of a different putative receptor may account for more severe neurological disease symptoms 1,48 . Further clinical studies are required to assess the role of receptor usage and pathogenesis in HPeV infection.
Like for other picornaviruses, one of the evolutionary mechanisms that shape HPeV genomic diversity is the accumulation of mutations, insertions or deletions 49 . When comparing the HPeV15 study strain to the prototype strain, a striking feature was the deletion of nine nt at the 3′ end of the VP1. The capsid protein VP1 is involved in virus binding to cell surface receptors. Whether this deletion in the VP1 protein of the HPeV15 study strain might affect pathogenicity through an altered interaction with host cell receptors warrants further investigation. Genetic recombination is also common process during evolution of HPeVs [48][49][50] . In agreement with previous studies 38,48,50 , the inconsistent phylogenies across the genome between provide evidence that recombination events have occurred in the evolutionary process of HPeV 49 . Moreover, similarity plot and boot scanning analysis with closely related sequences showed high (>90%) sequence similarity for worldwide distributed HPeV3 strains within the UTRs and the P2/P3 junction, suggesting potential recombination events in these regions. These results correlate with other studies that described the most frequent sites for HPeV recombination in the UTRs or nonstructural region 50,51 . However, despite the high similarity, we cannot infer that any of these HPeV3 strains are the exact recombination donor strain of these regions. Interestingly, the most similar strain in the P2/P3 junction, strain 651689, was identified in stool samples from children aged <5 years in the Netherlands (2006) 52 . This strain does not cluster anymore with other HPeV3 sequences when the nonstructural region is analysed 51 . The HPeV parental strain that recombined with this divergent Dutch strain is unknown, but it is plausible to hypothesize that our study strain and the divergent HPeV3 strain 651689 shared a recent common ancestor. Since these two strains belong to different types, this finding suggests potential intertypic recombination during HPeV15 evolution. Of note, it cannot be excluded recombination with HPeV types 9 to 13 as no whole-genomes sequences are currently available for these types. The fact that the countries of origin of viruses closely related to the HPeV15 strain are geographically far from Niger (China, Australia or The Netherlands, see Supplementary Table 1) and temporally distant from the study strain (Supplementary Table 1), supports the hypothesis that HPeV15 has been circulating globally throughout the years before a recombinant virus emerged.
The scarce number of published HPeV full-genome sequences limits interpretation of results for recombination analysis. More HPeV virus strains and complete genome sequences are needed to better monitor HPeV evolution, divergence and recombination events. This should be complemented with increased surveillance, implementation of molecular detection and typing assays in clinical settings and large-scale epidemiological studies of HPeVs to monitor novel types and further our understanding of their geographic distribution, circulation patterns, genetic diversity, type specific virulence and clinical relevance.

Materials and Methods
Ethics statement. The study did not involve human participants or experimentation, but the use of cell culture isolates of viruses recovered from stool samples of patients with AFP collected through routine poliomyelitis surveillance activities at the instigation of the World Health Organization (WHO) for public health purposes. All technical and ethical aspects were approved by WHO and the Ministry of Health of Niger. The protocol for routine AFP surveillance activities framed in the Global Polio Eradication Initiative were approved by the Ethics Committee of the WHO in compliance with applicable National regulations governing the protection of human subjects. The protocol was in accordance with the principles of the Helsinki Declaration. Informed consent was acquired from parents for the use of clinical samples at the time of sample collection.

Sample collection and virus isolation.
According to the WHO guidelines for poliomyelitis surveillance, two stool specimens from all AFP cases under 15 years of age should be collected 24-48 hours apart and within (2020) 10:6759 | https://doi.org/10.1038/s41598-020-63467-w www.nature.com/scientificreports www.nature.com/scientificreports/ one month of the onset of paralysis 53 . In this study, only one stool sample was collected on March 6 th and submitted on March 8 th , 2013. The sample was collected 8 days after the onset of paralysis (February 26 th ). Following the WHO guidelines for the virological investigation of poliomyelitis 54 , stool samples are processed and inoculated on two continuous cell lines (RD and L20B). For samples showing no cytopathic effect (CPE) after 5 days, a blind passage in the same cell line was done using the supernatant of the inoculated negative cultures and microscopically checked over the next 5 days. Viral cultures positive in RD cells were re-passaged in L20B cells and examined for 5 days to exclude the possibility that they were polioviruses. Only viral cultures producing CPE in RD cells and not in L20B cells were considered to contain NPEVs. Supernatant from RD cells showing a complete CPE was recovered and kept frozen (−20 °C) until typing.
Metagenomic sequencing. Extraction of RNA was performed as previously described 55 followed by treatment with Turbo DNase (Ambion) to digest contaminating DNA. Host rRNA were depleted from RNA samples using the NEBNext ® rRNA Depletion Kit (New England Biolabs) as described previously 56 . Importantly, samples were processed in a facility where no research on HPeVs has been performed. RNA from selective depletion was used for cDNA synthesis and Illumina library preparation using the Nextera XT kit with dual indexes, and sequenced on a NextSeq. 500 (75 cycles, paired-end reads; Illumina) platform.
Genome assembly. Raw paired-end files were processed for removal of Illumina adaptor sequences, trimmed and quality-based filtered using Trimmomatic software v0.36 57 . De novo assembly was performed using metaSPAdes with default parameters 58 . Contigs were queried against the ViPR database 59 retrieved in march 2019 using DIAMOND 60 . In addition to contigs matching EVA-71 (7345 (0.02%) reads mapped to this virus), whose sequence was previously obtained 13 , two large contigs corresponded to HePVs sequences. Using the sequence from the top DIAMOND hits (ADA79690, ANJ01603) as reference, HePVs contigs were assembled into a full-length HePV scaffold with missing sections (gaps). Iterative mapping with the mapper of clc-assembly-cell v5.1.0 (with length fraction set to 0.6) using the total reads on this gapped scaffold allowed to complete the missing sections and obtain a near complete genome (15272 (0.04%) reads mapped to HePV15). A final iteration of the mappings was performed with stringent parameters.
Phylogenetic analysis. Nt and aa sequence alignment was performed by using ClustalW multiple alignment program within the BioEdit Sequence Alignment Editor package, version 7.0.9.0. Maximum-likelihood (ML) phylogenies of HPeV strains were inferred using IQ-TREE, and branch support was calculated using ultrafast bootstrap approximation with 1000 replicates 61,62 . Prior to the tree reconstruction, the ModelFinder application 63 , as implemented in IQ-TREE, was used to select the best-fitted nt substitution model. Sequence divergence was determined by using MEGA5 to calculate mean pairwise distances within groups.
Recombination analysis. Similarity plot and bootscanning analysis were performed by using the SimPlot program, version 3.5.1, with a 400-nt window moving in 20-nt steps and using a Kimura two-parameter method with a transitions-transversions ratio of 2 with 1000 resampling.

Data availability
The nt sequence of the complete genome of the HPeV15 strain NIG13194 has been deposited in the GenBank database (Accession No. MN265386). The raw sequence files are available at NCBI under BioProject Accession No. PRJNA564393.