Genomic analyses of human adenoviruses unravel novel recombinant genotypes associated with severe infections in pediatric patients

Human adenoviruses (HAdVs) are highly contagious pathogens of clinical importance, especially among the pediatric population. Studies on comparative viral genomic analysis of cases associated with severe and mild infections due to HAdV are limited. Using whole-genome sequencing (WGS), we investigated whether there were any differences between circulating HAdV strains associated with severe infections (meningitis, sepsis, convulsion, sudden infant death syndrome, death, and hospitalization) and mild clinical presentations in pediatric patients hospitalized between the years 1998 and 2017 in a tertiary care hospital group in Bern, Switzerland covering a population base of approx. 2 million inhabitants. The HAdV species implicated in causing severe infections in this study included HAdV species C genotypes (HAdV1, HAdV2, and HAdV5). Clustering of the HAdV whole-genome sequences of the severe and mild cases did not show any differences except for one sample (isolated from a patient presenting with sepsis, meningitis, and hospitalization) that formed its own cluster with HAdV species C genotypes. This isolate showed intertypic recombination events involving four genotypes, had the highest homology to HAdV89 at complete genome level, but possessed the fiber gene of HAdV1, thereby representing a novel genotype of HAdV species C. The incidence of potential recombination events was higher in severe cases than in mild cases. Our findings confirm that recombination among HAdVs is important for molecular evolution and emergence of new strains. Therefore, further research on HAdVs, particularly among susceptible groups, is needed and continuous surveillance is required for public health preparedness including outbreak investigations.

HAdV6, HAdV57 and HAdV89), these are reported to be more clinically significant than other HAdV species in causing severe infections with life-threatening implications in young children and immunocompromised patients 3,13. More than half of HAdV infections in young children are associated with HAdV1 and HAdV2 2,10. After primary infection, genotypes of HAdV-C species may establish latent infections and are capable of long-term persistence in lymphoid cells [14][15][16]. Thus, asymptomatic individuals can shed infectious viruses in stool for many years 17,18.
HAdV genomes can be unstable as they are subjected to genetic drift resulting from base insertion, substitutions and deletions, but also are prone to changes observed as antigenic shifts originating from genomic recombination between at least two viral strains 19 . Early studies of HAdV recombination [20][21][22] led to the hypothesis that molecular evolution of HAdV strains may be driven by recombination 19 . Moreover, recent studies have shown that recombination among circulating HAdV strains is frequent and plays a critical role in shaping the phylogenetic relationships among HAdV genomes 23 . To date, no recombinant HAdV has been reported in Switzerland. The aim of our study was to determine if there were any genomic differences between circulating HAdV strains causing severe and mild clinical presentations in hospitalized pediatric patients in Bern, Switzerland. We investigated the phylogenomic relationships and potential recombination events among Swiss HAdV isolates based on their whole-genome sequences. Overall, our results document that recombination is a major factor for HAdV evolution and may possibly contribute to disease severity.

Results
Genomic characteristics and comparative analysis. The complete genomes of HAdV isolates from pediatric patients presenting with severe cases, including meningitis, sepsis, convulsion, sudden infant death syndrome, death, and hospitalization, were analyzed alongside those obtained from pediatric patients presenting with mild cases (Table 1). The mild cases used for genomic comparison were selected based on how their partial hexon gene nucleotide sequences clustered on the phylogenetic tree relative to partial hexon sequences from severe cases. All patient isolates in this study were initially typed by Sanger sequencing of the hypervariable hexon region 1-7 10. Molecular typing of a diagnostic specimen based on this region is sufficient to identify circulating HAdV genotypes causing infection among hospitalized patients. However, it does not allow precise and accurate resolution among genotypes, in particular for identifying and assessing potential recombination events and genome rearrangements. The HAdV genotypes analyzed in this study belonged to species C as these were the ones implicated in causing severe infection including death among
Phylogenetic analysis. Phylogenetic analysis was performed on whole-genome sequences and on complete sequences of the penton base, hexon and fiber genes to investigate the genetic relationships between the Swiss HAdV strains and the prototype HAdV strains obtained from GenBank (Fig. 1). Whole-genome phylogenetic analysis of the eight severe and six mild cases did not show any differences except for ADVJA-749-BE (isolated from a hospitalized patient presenting with sepsis and meningitis), which formed a separate cluster among the other HAdV species C genotypes (Fig. 1A). Overall, clustering of the complete hexon and fiber gene sequences agreed with clustering of the complete genomic sequences, except for isolate ADVJA-749-BE, in which intertypic recombination affected the hexon and fiber genes (Fig. 1A,C,D).
Phylogenetic analyses of the penton base gene showed a completely different clustering with relatively low bootstrap support values (Fig. 1B). This suggests high identity within the penton base gene between genotypes of the same HAdV species. Phylogenetic analysis of the three HAdV coding regions (penton base, hexon, and fiber) demonstrated that the ADVJA-749-BE strain was closely related to HAdV89 in its penton base and hexon genes, and to HAdV1 in its fiber gene (Fig. 1B-D, respectively).
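The tree-building description in this paper names the Jukes-Cantor model. As a minimal sketch of what that model assumes (equal base frequencies and equal substitution rates), the Python function below applies the standard Jukes-Cantor correction to an observed proportion of differing sites. It illustrates the pairwise distance implied by the model only; it is not the maximum-likelihood tree search or the bootstrap procedure used in the study, and the function name and example numbers are hypothetical.

```python
import math


def jukes_cantor_distance(p: float) -> float:
    """Estimated substitutions per site under the Jukes-Cantor model.

    p is the observed proportion of aligned sites that differ between
    two sequences. The correction d = -(3/4) * ln(1 - (4/3) * p)
    accounts for multiple substitutions at the same site.
    """
    if not 0.0 <= p < 0.75:
        raise ValueError("p must satisfy 0 <= p < 0.75 for the correction to be defined")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


# Illustrative input only (not a value from the study):
# 25 differing sites in a 1000-site alignment.
observed = 25 / 1000
corrected = jukes_cantor_distance(observed)
```

For closely related genomes such as the species C genotypes compared here, the corrected distance stays close to the observed proportion of differences, because few sites are expected to have changed more than once.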
Comparative genomic analyses of a potentially "novel" HAdV-C genotype. To characterize the genome of the ADVJA-749-BE strain, it was compared with all prototype strains of HAdV species C. Compared with the complete sequences of the six HAdV species C prototype strains, HAdV1 (AC_000017.1), HAdV2 (AC_000007.1), HAdV5 (AC_000008), HAdV6 (FJ349096), HAdV57 (HQ003817), and HAdV89 (MH121097), the ADVJA-749-BE strain shared the highest nucleotide similarity (97.37%) at the full-genome level with the prototype strain HAdV89 (Table 3). The highest similarity with this prototype strain was also observed within the penton base (98.31%), hexon (98.01%), E1B 55K (99.73%) and the E3 ORF (99.58%). In the E3 ORF, ADVJA-749-BE had the same nucleotide identity (99.58%) with the prototype strains of HAdV2 and HAdV6. The prototype strains HAdV2 and HAdV6 also showed the greatest similarity to ADVJA-749-BE in the E1A (99.89%). Furthermore, prototype strain HAdV2 showed the highest similarity with ADVJA-749-BE within the DBP (99.06%) and 100K (99.63%) genes. Within the DNA polymerase gene, ADVJA-749-BE shared the highest similarity (99.67%) with HAdV6. In the fiber region, by contrast, the ADVJA-749-BE strain shared higher identity with the HAdV1 strain (Table 3). A high level of similarity and identity across the genomes of HAdV species C was observed (Fig. 2). These results complement the phylogenetic analysis, demonstrating that the ADVJA-749-BE strain is highly similar to the prototype strains of HAdV species C but differs from them across the genome, with diversity concentrated in the hexon, E3 and fiber regions (Fig. 2).
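The region-by-region comparison above amounts to computing a percent identity between the isolate and each prototype and reporting the best match. The paper does not state how its identity values were computed, so the following Python sketch is only one plausible convention: it counts identical columns among alignment columns where neither sequence has a gap. The function names and the toy sequences are hypothetical, and the inputs are assumed to be already aligned.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of identical columns, skipping columns with a gap ('-') in either sequence."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must come from the same alignment (equal length)")
    compared = matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        matches += a == b
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


def best_match(query: str, prototypes: dict) -> tuple:
    """Return (name, identity) of the prototype with the highest identity to the query."""
    scores = {name: percent_identity(query, seq) for name, seq in prototypes.items()}
    name = max(scores, key=scores.get)
    return name, scores[name]


# Toy aligned sequences, not real HAdV data.
toy_prototypes = {"protoA": "ACGTACGA", "protoB": "ACGAACGA"}
toy_result = best_match("ACGTACGT", toy_prototypes)
```

Running the same function separately on each gene region of an alignment would yield a table in the spirit of Table 3, although the exact percentages depend on the alignment method and on how gaps are treated.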

Genomic recombination analysis. To investigate potential recombination events within and between HAdV genomes isolated in this study, particularly the genome sequences isolated from HAdV severe cases, we performed Bootscan analysis with the Swiss strains studied here and representative HAdV species C prototype strains available in GenBank (Fig. 3). Of note was isolate ADVJA-749-BE, which was associated with multiple recombination events. The genome sequence of ADVJA-749-BE, isolated from a hospitalized pediatric patient presenting with sepsis and meningitis, was identified as a possible recombinant that may have arisen from recombination events involving HAdV1, HAdV2, HAdV5, and HAdV89. Unlike this potentially new recombinant isolate ADVJA-749-BE, which has HAdV5 penton base, HAdV2/HAdV89 hexon and HAdV1 fiber, the hexon and fiber sequences of HAdV2 and HAdV89 were similar. The other samples from severe and mild cases also showed some potential recombination events, but to a lesser extent (Fig. 3, Supplementary Fig. S1). For severe infections due to HAdV species C, 75% (6 out of 8 cases) had potential recombination events involving at least two of the representative HAdV species C prototype strains, while 66.7% (4 out of 6 cases) of the mild infections had potential recombination events also involving at least two of the HAdV species C prototype strains.
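The idea behind scanning a genome for recombination can be illustrated with a much simpler procedure than Bootscan: slide a window along an alignment and record which prototype is most similar to the query in each window. A change in the closest prototype along the genome hints at a possible breakpoint. The Python sketch below is that simplified similarity scan only. It does not build bootstrapped neighbor-joining trees as Bootscan in SimPlot does, the function names are hypothetical, and the default window and step sizes are taken from the settings reported for Figure 3.

```python
def window_identity(query: str, ref: str, start: int, size: int) -> float:
    """Percent identity within one window, ignoring columns with a gap in either sequence."""
    pairs = [
        (a, b)
        for a, b in zip(query[start:start + size], ref[start:start + size])
        if a != "-" and b != "-"
    ]
    if not pairs:
        return 0.0
    return 100.0 * sum(a == b for a, b in pairs) / len(pairs)


def closest_reference_by_window(query: str, refs: dict, window: int = 5000, step: int = 100) -> list:
    """Return (window_start, closest_reference_name, identity) for each window.

    All sequences are assumed to come from one multiple alignment, so they
    share the same coordinates. Ties go to the first reference in dict order.
    """
    results = []
    last_start = max(len(query) - window, 0)
    for start in range(0, last_start + 1, step):
        scores = {name: window_identity(query, seq, start, window) for name, seq in refs.items()}
        best = max(scores, key=scores.get)
        results.append((start, best, scores[best]))
    return results


# Toy example: the query matches ref1 in its first half and ref2 in its second half.
toy_query = "A" * 10 + "C" * 10
toy_refs = {"ref1": "A" * 10 + "G" * 10, "ref2": "T" * 10 + "C" * 10}
toy_scan = closest_reference_by_window(toy_query, toy_refs, window=10, step=10)
```

In the toy example the closest reference switches from `ref1` to `ref2` between the two windows, which is the kind of pattern a mosaic genome would produce. Real analyses need the statistical support that Bootscan provides, since similarity alone cannot distinguish recombination from rate variation.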
Figure 1 (caption fragment). Eight severe cases (highlighted in purple, of those who died indicated with a red circle), six mild cases (highlighted in green), and GenBank sequences of the prototype strains are indicated in bold black font. Labelling indicates isolate names or accession number (for prototype strains), followed by HAdV genotype/species classification. Alignment positions relative to prototype sequence AC_000007 are indicated for each major HAdV gene (penton base, hexon, and fiber). Phylogenetic trees were generated using the Maximum Likelihood method based on the Jukes-Cantor model 24 with 1000 bootstrap replicates. The percentage of trees in which the associated taxa clustered together is shown next to the branches. The tree is drawn to scale, with branch lengths proportional to the number of substitutions per site.
Amino acid analysis of the potentially "novel" HAdV-C genotype with parent prototype strains. To assess the overall sequence similarity of the novel HAdV C (ADVJA-749-BE) with its candidate parent HAdV C prototype strain sequences (HAdV5 penton, HAdV2/89 hexon and HAdV1 fiber), an amino acid alignment was performed (Fig. 4). Compared with the HAdV5 prototype strain (AC_000008), the novel HAdV C (ADVJA-749-BE) had 98.8% identical sites, a pairwise identity of 98.8% with the complete penton base, two amino acid substitutions (L152S and P153L) in the hypervariable region 1 (HVR1) of the penton base, four amino acid substitutions (S310G, V331A, D342E and A363E) and one amino acid deletion (P364) in the RGD loop of the penton base (Fig. 4A). Although the amino acid substitution (A363E) and deletion (P364) were shown to be unique to HAdV89 by Dhingra et al. 25, we found that the novel HAdV C (ADVJA-749-BE) also has the substitution (A363E) and deletion (P364) in the RGD loop of the penton base. Compared with other HAdV C prototype strain sequences, ADVJA-749-BE had one unique amino acid substitution (V331A) in the RGD loop and one unique substitution (D342E) in the RGD motif region of the penton base (Fig. 4B). Interaction of the penton base RGD motifs with cellular integrins facilitates virus internalization 26. There were 99.2% identical protein sites among HAdV2, HAdV89 and the novel HAdV C (ADVJA-749-BE) and a pairwise identity of 99.4%. Compared with the HAdV2 prototype strain (AC_000007.1), the novel HAdV C (ADVJA-749-BE) had one amino acid addition (149E), one amino acid substitution (E200G) in the HVR1-6 region of the hexon and three amino acid substitutions (G448D, S449A and D454N) in the HVR7 region of the hexon. Compared with the HAdV89 prototype strain (MH121097), the novel HAdV C (ADVJA-749-BE) had two amino acid substitutions (E200G and L306M) in the HVR1-6 region of the hexon protein and one substitution (G448D) in the HVR7 region of the hexon protein (Fig. S2). Compared with the HAdV1 prototype strain (AC_000017.1), the novel HAdV C (ADVJA-749-BE) had 98.8% identical sites and 98.8% pairwise identity with the complete penton base and seven amino acid substitutions (K74E, N199S, H414Y, R442K, R510T, T563S and I565M) in the fiber protein (Fig. S3).
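Substitution labels such as L152S can be read directly off a pairwise protein alignment: the reference residue, its position in the reference, and the residue found in the isolate. The Python sketch below shows one way to derive such labels from two aligned protein sequences. It is an illustration under stated assumptions rather than the procedure used for Fig. 4: positions are numbered along the ungapped reference, and the `del` and `ins` labels for deletions and insertions are a notation chosen here, not the paper's.

```python
def list_changes(ref_aln: str, query_aln: str) -> list:
    """List differences between two aligned protein sequences.

    Substitutions are written like 'L152S' (reference residue, reference
    position, query residue). Deletions in the query are written like
    'P364del'. Insertions are written like 'ins148E', meaning residue E
    inserted after reference position 148.
    """
    if len(ref_aln) != len(query_aln):
        raise ValueError("sequences must come from the same alignment (equal length)")
    changes = []
    ref_pos = 0  # 1-based position of the last reference residue seen
    for r, q in zip(ref_aln, query_aln):
        if r != "-":
            ref_pos += 1
        if r == q:
            continue
        if r == "-":
            changes.append(f"ins{ref_pos}{q}")
        elif q == "-":
            changes.append(f"{r}{ref_pos}del")
        else:
            changes.append(f"{r}{ref_pos}{q}")
    return changes


# Toy aligned peptides, not real HAdV sequences.
toy_changes = list_changes("MKL-PA", "MRLEP-")
```

Because the numbering follows the reference, the same isolate can yield different position labels against different prototypes, which is why each comparison in the text names the prototype strain it is measured against.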

Discussion
We conducted a comparative genomic analysis using whole-genome sequences of HAdV species C from hospitalized pediatric patients in Bern, Switzerland. Among the six genotypes of HAdV species C, HAdV1 and HAdV2 were previously reported to cause higher morbidity than other genotypes among the pediatric patients hospitalized between 1998 and 2017 in a tertiary care hospital group in Bern, Switzerland covering a population base of 2 million inhabitants 10. A trend toward more severe cases due to HAdV species C has also been reported by Esposito et al. 27. To date, no recombinant HAdV has been reported or identified in Switzerland, mainly because of a lack of studies on HAdV epidemiology and molecular evolution. Typing systems based on the partial gene sequence of one or more of the three major capsid genes (penton base, hexon, and fiber) [...] 25,28,29 and may have consequences for HAdV detection and pathogenicity, information obtained on single gene sequences may not provide enough molecular resolution. Therefore, whole-genome sequencing based methods are recommended to investigate adenovirus phylogenomic relationships. Using a whole-genome sequencing approach, fourteen HAdV-C whole-genome sequences (eight from severe cases and six from mild cases) were generated from hospitalized pediatric patients. The sequences were subjected to phylogenetic analyses and probed for potential recombination events and genome rearrangements, which are recognized as important mechanisms that may influence tissue tropism, and potentially pathogenicity and virulence of novel HAdV pathogens 25,29,30. Global pairwise genome comparisons were also performed (Fig. 2). Our results demonstrated that the phylogenetic analysis of the complete genomes, hexon and the fiber genes in this study provided similar information regarding clustering of the HAdV strains except for one strain (ADVJA-749-BE) showing intertypic recombination affecting the hexon and fiber genes. The penton base phylogeny, however, failed to provide much meaningful information on HAdV species C strains as the 14 sequences did not show much divergence between each other (Fig. 1B), a result supported by the findings of Zhang and Huang 31. Moreover, a study by Robinson et al. found that HAdV species C and E have similar penton base genes but show diversity in the hexon, fiber, and E3 ORFs 28, which is also observed in our study (Fig. 2).
Figure 3. Bootscan analysis of the whole-genome sequences for severe HAdV isolates compared with sequences of prototype HAdV1, HAdV2, HAdV5, HAdV6, HAdV57, and HAdV89. Bootscan of whole-genome sequence from a hospitalized patient presenting with sepsis and meningitis due to HAdV species C genotype (A), meningitis due to HAdV1 (B), sudden infant death syndrome due to HAdV1 (C), gastroenteritis and hospitalization due to HAdV2 (D), death resulting from HAdV2 (E), sudden infant death syndrome due to HAdV2 (F), fever, convulsion, and hospitalization due to HAdV2 (G), and sudden infant death syndrome due to HAdV5 (H). Analyses were performed using SimPlot (see "Methods" section). The genotypes involved in recombination events for each of the severe cases are indicated on each panel. The black bar at the top represents the genome map with black arrows indicating approximate position of the coding transcripts and their direction. The legend shows the representative prototype HAdV species C strains used for comparison with the labelling as accession number-HAdV genotype. The percentage of permuted trees that supported grouping is marked along the y-axis and the genome nucleotide positions are indicated along the x-axis. Parameter settings for the recombination analysis using Bootscan in the SimPlot software were: window size (5000 nucleotides), step size (100 nucleotides), replicates used (n = 100), gap stripping (on), distance model (Kimura) and tree model (Neighbor-joining).
Of particular note was the identification of a strain (ADVJA-749-BE) that represents a potentially novel HAdV species C genotype isolated from a hospitalized child presenting with sepsis and meningitis, and thus may be an etiological agent associated with sepsis and meningitis. Although the initial typing results based on the partial hexon gene identified HAdV2 as the cause of infection, phylogenomic analysis indicated that this strain formed a separate cluster among the other genotypes of HAdV species C (Fig. 1A), sharing the highest nucleotide identity (97.4% and 97.3%) with HAdV89 and HAdV2, respectively, at the genome level (Table 3), but possessed a fiber gene sequence identical to that of HAdV1 (Figs. 1D, 2, Table 3). Moreover, recombination analysis further confirmed that the ADVJA-749-BE strain was a recombinant of HAdV1, HAdV2, HAdV5, and HAdV89 (Fig. 3A). In addition, the amino acid sequence alignment of the novel HAdV-C versus the suggested parent (HAdV5 penton base, HAdV2/HAdV89 hexon and HAdV1 fiber) amino acid sequences showed a high similarity (Fig. 4). Novel HAdV pathogens are well-known to arise from recombination events occurring only between HAdV genotypes of the same species and in regions of high sequence homology 20,21. We therefore propose that the ADVJA-749-BE isolate is a potentially novel genotype of HAdV species C, which may be an etiological agent associated with sepsis. Of the HAdV species C genotypes, HAdV57 (isolated from stool of a healthy child in 2001) 32 and HAdV89 (identified from stool of an immunosuppressed patient in 2015) 25 were both identified as recombinant viruses, with HAdV57 having a similar fiber gene to HAdV6 and harboring a unique hexon distinguished by its loop-2 motif 32, whilst HAdV89 had a novel penton base sequence 25. The circulation of recombinant HAdV species C strains has also been recently reported 31. For natural recombination to occur, co-infection is required.
Co-infection by two or more genotypes of HAdV species has been documented 33 . A study by Lukashev et al., in which they analyzed 16 HAdV species C field strains at four genomic regions including the hexon, fiber, polymerase and E1A regions, suggested that recombination is frequent 23 . This finding was also supported by a recent study by Dhingra et al. which demonstrated that potential multiple recombination events within the E1 and E4 gene regions are likely to contribute to the evolution of species HAdV-C 25 .
Overall, recombination and genomic rearrangements within the three major HAdV capsid genes (penton base, hexon, fiber) that are important determinants of tropism, as well as the E3 region that harbors genes affecting the host immunity after virus infection 34 may contribute to disease severity. Nevertheless, further work is needed to verify factors contributing to HAdV disease severity so as to better understand ways of developing effective preventive and therapeutic measures.

Methods
Samples. This study was based on already stored material at the Institute for Infectious Diseases (IFIK).
There was no intervention or interference with patient management, as clinical samples were already obtained routinely at IFIK for viral diagnostics. Decision to send samples to IFIK remained at the sole discretion of the physician in charge of the patients.
We defined severe cases retrospectively as those associated with death or requiring hospitalization. HAdV isolates from patients presenting with severe cases and a subset of selected mild cases were obtained from the IFIK clinical bio-bank. Selection of HAdV isolates associated with mild cases was based on HAdV hexon sequences that clustered phylogenetically in the same branches as, or in different branches from, the HAdV severe cases (Fig. 1), as determined previously 10. This study was approved by the Swiss Ethics Committees on Research involving humans (BASEC-Nr:Req-2018-00158), which waived the requirement to provide informed consent given that the study was retrospective, based on already stored material obtained for the same, original viral diagnostic purposes, and did not interfere with patient management or treatment. The study was conducted in accordance with the present protocol, the current version of the "Declaration of Helsinki", the "Good Clinical Practice (GCP)" Guidelines, Swiss law and the requirements of the competent authorities and the Ethics Committee.

Cell lines and virus DNA isolation.
Cell cultures were performed in A549 cells purchased from the American Type Culture Collection (Manassas, VA, USA) and were maintained in Earle's minimal essential medium (MEM) (Biochrom GmbH, Germany) supplemented with 2.2 g/l NaHCO3 (Biochrom), 1% L-glutamine (Merck, Germany), 1% penicillin (10,000 U/ml) and streptomycin (10,000 μg/ml) (Biochrom), 1% fungizol (CPS Cito Pharma Services, Switzerland), and 1% heat-inactivated Fetal Bovine Serum (FBS) (Biochrom) at 37 °C and 5% CO2. The samples inoculated in A549 cells were incubated at 37 °C for 3-7 days or until a cytopathic effect was observed. After the cytopathic effect was confirmed, the positive cell culture supernatant was subjected to DNA extraction. Viral DNA was extracted automatically from 200 µl of supernatant from the HAdV-positive cell cultures using the NUCLISENS easyMAG (bioMérieux, Geneva, Switzerland) extractor, as per the manufacturer's instructions. After DNA extraction, the DNA concentration of each sample was measured using the Qubit dsDNA high sensitivity assay kit on the Qubit 3.0 Fluorometer (ThermoFisher Scientific, Zug, Switzerland) as per the manufacturer's protocol prior to whole genome amplification (WGA).
Whole-genome amplification (WGA), preparation of sequencing libraries. The Seqplex enhanced DNA amplification kit (Sigma-Aldrich Chemie GmbH, Buchs SG, Switzerland) was used for WGA of the DNA as per the manufacturer's protocol. The DNA was sheared to 400 bp using the Covaris M220 system (Covaris Ltd, Brighton, United Kingdom) prior to WGA. The Covaris program for shearing was as follows: temperature 20 °C, duty factor 20, cycles/burst 200, peak power 50, and time 60 s. Libraries were constructed with the Ion plus fragment library kit using the ABI library builder according to the manufacturer's instructions (ThermoFisher Scientific).
Sequencing barcoded libraries were prepared automatically on the ABI Library Builder System with the Ion plus Fragment Library Kit (ThermoFisher Scientific) according to the Ion Xpress Plus and Ion Plus Library preparation (ThermoFisher Scientific). The generated adapter-ligated libraries were subjected to size selection with 0.55× Agencourt AMPure XP magnetic beads (Beckman Coulter, Nyon, Switzerland), followed by measurement of the concentration using Qubit dsDNA High Sensitivity Assay kit (ThermoFisher Scientific) and assessment of the size distribution using the Agilent 2100 Bioanalyzer system with the Agilent High Sensitivity DNA kit (Agilent Technologies AG, Basel, Switzerland). Sample libraries were pooled and loaded automatically on the 530-chip using the Ion Chef instrument according to the manufacturer's instructions. The loaded chip was then inserted into the Ion S5XL for sequencing with 850 flows using the Ion 530 (400 bp) chip kit.
Scientific Reports | (2021) 11:24038 | https://doi.org/10.1038/s41598-021-03445-y
Bioinformatic analyses. Raw sequence data in BAM file format were imported into CLC genomics workbench v12.0.3 (QIAGEN, Aarhus, Denmark) and trimmed with the following parameters: removal of adaptor/barcode sequences from both ends, and discarding of reads with ambiguous bases, base quality below Q30, or read length below 50 nt. The trimmed reads were exported as a fasta file from CLC genomics workbench and imported into Geneious Prime 2020.1.2 software (http://www.geneious.com 35) for further analysis. The trimmed reads were normalized to 100× coverage using BBNorm (version 38.37) and duplicate reads were removed using Dedupe (version 38.37; k-mer seed length of 31). Reads were subjected to de novo assembly using SPAdes assembler (version 3.13.0) with default parameters. As the genotypes of most samples were previously identified based on Sanger sequencing of the partial hexon gene, the resulting contigs for each sample were mapped to the HAdV prototype strain sequence to which the sample belonged, either HAdV1 (AC_000017), HAdV2 (AC_000007) or HAdV5 (AC_000008), using Bowtie2 (version 2.3.0) with "end-to-end" alignment and default parameters. Consensus sequences were produced and visually inspected. Following visual inspection of consensus sequences, gaps indicating missing nucleotides were identified and corrected by Sanger sequencing. Annotation of the consensus sequences was performed based on HAdV reference prototype strain genome annotations, using the Annotate & Predict tool within Geneious Prime 35. The annotations were manually checked and edited. The genome sequences for all 14 isolates were deposited to the European Nucleotide Archive, under project reference PRJEB40708.
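The read-trimming criteria listed above (no ambiguous bases, base quality of at least Q30, read length of at least 50 nt) can be expressed as a small filter. The Python sketch below is a simplified stand-in written for illustration. It is not the trimming algorithm implemented in CLC genomics workbench, whose quality trimming works differently, and it assumes that low-quality bases are trimmed from the read ends before the length and ambiguity checks. The function name and the toy reads are hypothetical.

```python
def trim_and_filter(seq: str, quals: list, min_q: int = 30, min_len: int = 50):
    """Trim low-quality ends, then keep the read only if it passes all checks.

    seq   : read bases
    quals : Phred quality score per base (same length as seq)
    Returns (trimmed_seq, trimmed_quals), or None if the read is discarded.
    """
    if len(seq) != len(quals):
        raise ValueError("seq and quals must have the same length")
    start, end = 0, len(seq)
    # Drop bases below the quality threshold from the 5' end, then the 3' end.
    while start < end and quals[start] < min_q:
        start += 1
    while end > start and quals[end - 1] < min_q:
        end -= 1
    seq, quals = seq[start:end], quals[start:end]
    if len(seq) < min_len:
        return None  # too short after trimming
    if any(base not in "ACGT" for base in seq.upper()):
        return None  # contains an ambiguous base such as N
    return seq, quals


# Toy reads, not real sequencing data.
kept = trim_and_filter("A" * 60, [10] * 5 + [35] * 55)
too_short = trim_and_filter("A" * 40, [35] * 40)
ambiguous = trim_and_filter("A" * 30 + "N" + "A" * 30, [35] * 61)
```

Filtering before normalization and assembly keeps low-quality or ambiguous reads from introducing errors into the contigs that are later mapped to the prototype genomes.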