Fatal Community-acquired Pneumonia in Children Caused by Re-emergent Human Adenovirus 7d Associated with Higher Severity of Illness and Fatality Rate

Human adenoviruses (HAdVs) are highly contagious pathogens causing acute respiratory disease (ARD), such as community-acquired pneumonia. HAdV-7d, a re-emergent genomic variant, has been recently reported in Asia and the United States after a several-decade absence. However, whether HAdV-7d is associated with higher severity than other types is currently unclear. In this study, the clinical and epidemiological investigation showed that fever, cough, and sore throat were the three most common respiratory symptoms of HAdV infections. HAdV-7 caused longer duration of fever, higher morbidity of tachypnea/dyspnea, pleural effusion, diarrhea, hepatosplenomegaly, consciousness alteration, as well as higher rates of pneumonia, mechanical ventilation and higher fatality rate (28.6%) than other types, particularly HAdV-3 and HAdV-2. The genomes of seven HAdV-7d isolates from mild, severe, and fatal cases were sequenced and highly similar with each other. Surprisingly, two isolates (2011, 2012) had 100% identical genomes with an earlier strain from a fatal ARD outbreak in China (2009), which elucidates the virus origin and confirms the unexpected HAdV genomic conservation and stability. Phylogenetic analysis indicated that L1 52/55-kDa DNA packaging protein may be associated with the higher severity of illness and fatality rate of HAdV-7. Clinicians need to be aware of HAdVs in children with ARD.

Comparative genomic analysis of the seven HAdV-7 isolates associated with severe and mild ARD. To better understand how the genetic characteristics of HAdV-7 relate to clinical manifestations and outcomes in children with severe ARD versus mild respiratory disease, seven representative HAdV-7 isolates were selected for next-generation whole-genome sequencing: one from an outpatient (OP) with mild respiratory disease (OP01_2011); three from inpatients (IP) with a fatal outcome (IP01_2010, IP02_2011 and IP03_2011); two from severe cases that required intensive care and mechanical ventilation (IP04_2011 and IP05_2012); and one from an IP with comparatively mild ARD (IP06_2012; diagnosed as bronchitis and discharged after a 4-day hospitalization). Sequencing was performed on the Ion Torrent PGM™ using 318 v2 chips and yielded 6,063,042 reads with an average length of 182 bp and a mean coverage of 2,409x. The assembly of each sample yielded 3 or 4 contigs after PASA analysis. Every sample had one long contig covering more than half of the reference genome (N50 > 17 kb). The local coverage rates varied between 91% and 92%. Each sample had 3-5 gaps. These gaps were covered by Sanger sequencing more than three times each (84 in total) following PCR amplification. The complete genome sequences of strains OP01_2011, IP01_2010, IP02_2011, IP03_2011, IP04_2011, IP05_2012 and IP06_2012 were deposited in the GenBank database under accession nos. KP670857, KP670858, KP670856, KP670855, KP670861, KP670860, and KP670859, respectively.
Sequence analysis demonstrated that these seven isolates had identical hexon, fiber and penton base genes. Further pairwise genome alignments indicated that the genome sequences of two isolates (IP04_2011 and IP06_2012) were 100% identical to each other as well as to the previously reported HAdV-7 strain 0901HZ/ShX/2009, which was associated with fatal pneumonia in an ARD outbreak in Shaanxi Province, China, in 2009 7,15 . The genome sequence identities between these two isolates and the other five isolates were extremely high (more than 99.98%). Compared with strain 0901HZ/ShX/2009, the genomes of isolates OP01_2011 and IP05_2012 each contained a single T insertion, in the E3 non-coding region and in Virus-Associated (VA) RNA II, respectively (Table 6). Isolate IP01_2010 contained four nucleotide mutations (G to A) in the membrane protein E3 RID-alpha, which led to four non-synonymous substitutions in this protein. There was only one synonymous substitution (C to T) in the hexon assembly-associated protein of isolate IP02_2011. Isolate IP03_2011 had one more non-synonymous substitution, located in the 34 kDa protein (G to T).
[Table fragment: laboratory data and no. patients tested, comparing all HAdV-7 (n = 14) with the other types (n = 54), with p values; first row (WBC, × 10^9 cells/L, n = 68) truncated.]
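The pairwise identities reported above can be made concrete with a small calculation. The following is a minimal sketch, assuming the two genomes have already been aligned to equal length with "-" as the gap character; the function names are illustrative and are not taken from the software used in the study. It also shows how few differing positions an identity above 99.98% allows on a genome of about 35,239 bp (the length of the reference strain given in the Methods).

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of aligned columns in which both sequences carry the same base.

    Columns where both sequences have a gap are skipped; a gap opposite a
    base counts as a difference (for example a single-base insertion).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" and y == "-":
            continue
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


def max_differences(genome_length: int, min_identity_pct: float) -> int:
    """Largest number of differing columns compatible with an identity floor."""
    return int(genome_length * (1.0 - min_identity_pct / 100.0))


# Toy aligned sequences (hypothetical, not study data): one substitution in eight columns.
toy_identity = percent_identity("ACGTACGT", "ACGTACGA")

# Differences allowed by an identity of at least 99.98% on a 35,239 bp genome.
allowed = max_differences(35239, 99.98)
```

Under these assumptions, an identity above 99.98% corresponds to only a handful of differing positions across the whole genome, which is consistent with the one to four mutations listed per isolate.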
Genome type determination of the seven HAdV-7 isolates. Although seemingly antiquated when compared with genome sequencing, REA profiles are still helpful to compare the unsequenced but previously reported genome types and strains 1 . The genome types of all seven HAdV-7 isolates were determined by comparing REA profiles from both in silico (Fig. 3A) and agarose gel electrophoresis (Fig. 3B) with other HAdV-7 genome types reported earlier 1 . Isolates OP01_2011, IP01_2010, IP02_2011, and IP05_2012 were chosen as representatives for in silico REA (Fig. 3A); IP01_2010 and IP06_2012 were chosen for agarose gel electrophoresis REA (Fig. 3B). The REA profiles of HAdV-7 prototype strain Gomen and HAdV-7d strain DG01_2011 were chosen as references. According to the genome type denomination of Li, et al. 33 , all of these seven strains are identified as HAdV-7d (Fig. 3), evidenced by the REA patterns and identical with the first reported HAdV-7d [33][34][35] , as well as with the re-emergent HAdV-7d strains DG01_2011 and 0901HZ/ShX/2009 1,7,15 .
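The idea behind in silico REA can be sketched in a few lines: find every recognition site in the genome sequence, cut at a fixed offset, and compare the resulting fragment lengths. This is a minimal illustration and not the Vector NTI procedure used in the study. BamHI (recognition site GGATCC, cutting after the first G) is chosen here only as an example enzyme, the toy sequence is hypothetical, and the function names are assumptions. Because the site is palindromic, searching one strand is sufficient.

```python
def digest_linear(sequence: str, site: str, cut_offset: int) -> list[int]:
    """Fragment lengths (largest first) from cutting a linear DNA molecule.

    site: recognition sequence on the top strand, e.g. "GGATCC"
    cut_offset: bases from the start of the site to the cut position
    """
    seq = sequence.upper()
    site = site.upper()
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = seq.find(site, start + 1)
    boundaries = [0] + cuts + [len(seq)]
    fragments = [b - a for a, b in zip(boundaries, boundaries[1:]) if b > a]
    return sorted(fragments, reverse=True)


def same_rea_profile(seq_a: str, seq_b: str, site: str, cut_offset: int) -> bool:
    """True when two sequences give identical fragment lengths for one enzyme."""
    return digest_linear(seq_a, site, cut_offset) == digest_linear(seq_b, site, cut_offset)


# Example enzyme (an assumption for illustration): BamHI, G^GATCC.
BAMHI_SITE, BAMHI_OFFSET = "GGATCC", 1

# Hypothetical toy sequence with two BamHI sites.
toy_genome = "AAAA" + "GGATCC" + "TTTTT" + "GGATCC" + "CC"
toy_fragments = digest_linear(toy_genome, BAMHI_SITE, BAMHI_OFFSET)
```

A point mutation that falls outside every recognition site leaves the fragment pattern unchanged, which is why isolates that differ by a few nucleotides can still share one REA profile and one genome type.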
Phylogenetic analysis of hexon, fiber, penton base genes, L1 52/55 kDa DNA packaging protein gene and the whole genomes. Phylogenetic analysis of the hexon, penton base, and fiber genes of the Chinese isolates showed no substantial differences from those found in the Japanese, Korean and US isolates (bootstrap values less than 80) (Fig. 4A, 4B, 4C). Phylogenetic analysis of the fiber gene showed that this gene was highly conserved in all the HAdV-7 isolates analyzed (Fig. 4C). However, the phylogenetic tree based on the L1 52/55 kDa DNA packaging protein indicated that the isolates circulating in China, the USA, and Argentina were much closer to HAdV-16 and distinct from the HAdV-7 prototype Gomen strain (1952) (Fig. 4D). Additionally, phylogenetic analysis of 31 available HAdV-7 whole genomes revealed that the seven HAdV-7d isolates represented an unusual clade (bootstrap value: 100) along with the other HAdV-7d strains circulating in China, including 0901/HZ/ShX/2009, CQ1198_2010, and DG01_2011 (Fig. 4E), which was distinct from the HAdV-7d2 USA strains and the prototype Gomen strain (Fig. 4A, 4B, 4D). The overall genome sequences of these hyper-virulent isolates are very distinct.

Discussion
CAP is a common and serious infection that afflicts children throughout the world 36 . It is the world's most important cause of child death 37 . In the United States, the annual incidence of pneumonia was 15.7 cases per 10,000 children, with the highest rate among children younger than 2 years of age (62.2 cases per 10,000 children) 38 .
In the developing world, the annual incidence is higher, and CAP is also more severe and is the largest killer of children [39][40][41] . Respiratory syncytial virus, HAdV and human metapneumovirus were more common among children younger than 5 years of age than among older children 38 . Pediatric ARD, especially CAP, caused by HAdV infection has raised public health concerns worldwide because of its high morbidity and mortality, especially in immunocompromised populations. However, few large-scale nationwide epidemiologic and clinical data are currently available. In our study, the HAdV infection rate was 5.09% in pediatric inpatients and outpatients in Southern China, whereas it was 12%-20.1% in hospitalized children in China 42,43 . The type distributions of HAdV in Southern and Northern China differ substantially. In Southern China, most HAdV-positive cases were caused by HAdV-3 (61.6%), followed by HAdV-7 (15.2%), which was similar to the previous study 43 . However, HAdV-7 (46.2%) dominated in Northern China 42 . In Southern China, HAdV infection rates peaked in summer, with the exception of 2010. This was consistent with the study of hospitalized children in Southern China during 2012-2013 43 , but differed from Northern China, where the peak was in winter 42 .

Remarkably, we found that HAdV-14 emerged in September 2010, one month earlier than the first HAdV-14 strain reported in China 21 . The relationship between the two strains merits further investigation. Another study in Northern China found that adolescent and adult patients infected with HAdV-55 had higher pneumonia severity index scores than those infected with other types 3 . However, because of differences in both the number of cases of each type and the age groups, the severity of illness caused by HAdV-7 and -55 cannot be compared between their study and ours. Moreover, a high rate of HAdV co-infection (36.8%) with other respiratory viruses was also found. The dominant pathogens co-infecting with HAdV were RSV, HBoV, HPIV and HCoV, which was similar to a report from Northern China 42 . However, no statistically significant difference was found in clinical manifestations or laboratory data between HAdV co-infection and HAdV single infection.
HAdV-7d was first identified in 1980 in Beijing and rapidly became the major genome type circulating in China through 1990 45 . In 2009 and 2011, HAdV-7d re-emerged in China after an approximately 20-year absence and caused two ARD outbreaks in children with a fatality 1,7 . It was also the prevalent genome type in Japan from 1987 to 1992 46 and in Korea during two outbreaks in 1995-1996 and 2001-2002 47 , with a high fatality rate (18%) in children 48 . Recently, it emerged as a cause of severe CAP in adults in the USA during 2013-2014 27,32 . In our study, HAdV-7d caused more severe clinical manifestations and adverse consequences than other types, particularly HAdV-3 and HAdV-2 (61 and 12 cases, respectively), with a longer duration of fever; a higher incidence of tachypnea/dyspnea, pleural effusion, diarrhea, hepatosplenomegaly, and consciousness alteration; and higher rates of pneumonia, mechanical ventilation and death. As there were relatively few cases of HAdV-14 and HAdV-55 infection in our samples, whether HAdV-7d causes more severe clinical manifestations or adverse consequences than these two types is unknown. More cases will be collected for further analysis.
In our study, whole-genome sequencing was carried out on the Ion Torrent PGM™ using 318 v2 chips, which yielded 6,063,042 reads with an average length of approximately 182 bp and a mean coverage of 2,409x. Gaps and ambiguous sequences were PCR-amplified using different primers and re-sequenced by the Sanger method for more than three-fold coverage. The genome ends were determined by direct end sequencing using genomic DNA as template. Given that Ion Torrent cannot generate reads across long (> 14-base) homopolymer tracts and cannot predict the correct number of bases in homopolymers > 8 bases long 49 , sequencing errors are possible in these regions. Additionally, the reported sequencing error rate for Ion Torrent was 1.78%, and the proportion of error-free reads, without a single mismatch or indel, was 15.92% 49 .
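To see why a per-base error rate of this size makes deep coverage and Sanger confirmation worthwhile, a simple back-of-the-envelope model helps. The sketch below assumes that errors occur independently and at a uniform rate along a read, which is a simplification (homopolymer errors, for example, are clustered). The function names are illustrative. The benchmark figure of 15.92% cited above was measured on that study's own reads, so this model is not expected to reproduce it.

```python
def expected_errors_per_read(per_base_error: float, read_length: int) -> float:
    """Mean number of erroneous bases in a read of the given length."""
    return per_base_error * read_length


def error_free_read_fraction(per_base_error: float, read_length: int) -> float:
    """Probability that a read contains no error at all.

    Assumes independent, uniformly distributed errors (a simplification).
    """
    if not 0.0 <= per_base_error <= 1.0:
        raise ValueError("per_base_error must be between 0 and 1")
    if read_length < 0:
        raise ValueError("read_length must be non-negative")
    return (1.0 - per_base_error) ** read_length


# Values quoted in the text: 1.78% error rate, 182 bp average read length.
ERROR_RATE = 0.0178
MEAN_READ_LENGTH = 182

mean_errors = expected_errors_per_read(ERROR_RATE, MEAN_READ_LENGTH)
clean_fraction = error_free_read_fraction(ERROR_RATE, MEAN_READ_LENGTH)
print(f"expected errors per read: {mean_errors:.2f}")
print(f"modelled error-free read fraction: {clean_fraction:.3f}")
```

Under this model most individual reads carry at least one error, so base calls are reliable only as a consensus over many overlapping reads, and regions that remain ambiguous are best confirmed by an independent method.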
We believe that genetic differences in the HAdV-7 genomes played an important role in the clinical consequences of pediatric ARD, with HAdV-7d causing more severe illness. Unexpectedly, we observed that the complete genome sequences of HAdV-7d isolates from mild, severe and fatal respiratory diseases were almost indistinguishable. Also surprisingly, strains IP04_2011 (severe case) and IP06_2012 (mild case) were 100% identical to each other as well as to the fatal strain (0901/HZ/ShX/2009) from the 2009 ARD outbreak 7,15 ; the other five strains had only a few nucleotide mutations. To our knowledge, this is the first report that the complete genomes of three isolates from different periods were identical with each other, which confirms the expected degree of genomic sequence conservation and stability of HAdVs. Based on the genome sequence identity, we conclude that strains IP04_2011 and IP06_2012 in Guangzhou originated from strain 0901/HZ/ShX/2009 in Xixiang County, Shaanxi Province, although the two locations are more than 1,000 kilometers apart. The unexpectedly high genome identities between isolates from mild, severe and fatal cases indicate that the difference in HAdV-7d pathogenicity may be due to host immunity. The detailed risk factors associated with adverse consequences caused by HAdV-7d merit further investigation.
Figure 3 (caption; beginning missing). … were chosen to be analyzed. All seven HAdV-7 isolates in this study have identical REA profiles, different from the Gomen prototype, and are identified as genome type HAdV-7d. M, molecular weight markers corresponding to a 500 bp DNA Ladder (A) and DL10,000 DNA Marker (B), respectively. Nota bene: the agarose gel electrophoresis of RE-digested genomic DNA was not run at the same time but under the same conditions.
Recombination analysis revealed that the genome variant 7d differs from the 1950s-era prototype strain by a lateral gene transfer, substituting the coding region for the L1 52/55 kDa DNA packaging protein from HAdV-16 1 . Our phylogenetic analysis confirmed this recombination in HAdV-7b, 7d, 7d2, and 7h (Fig. 4D), but not in the prototype Gomen strain. The importance of this recombination remains unknown. However, a previous study found that this protein was expressed in both the early and late stages of infection 50 . L1 52/55 kDa proteins and IVa2 interacted in infected cells and bound in vivo to the packaging sequence 50,51 . It is an essential protein required for DNA packaging and encapsidation 52 , and thus contributes to adenoviral reproduction. All the emergent pathogens (7b, 7d, 7d2, and 7h) contained this particular moderate-sized recombination from HAdV-B16 into the HAdV-7 genome chassis. These data provide a clue that the L1 52/55 kDa DNA packaging protein may be associated with the HAdV-7 hypervirulent phenotype, but this requires confirmation.
Figure 4. Phylogenetic analysis of the HAdV-7 isolates associated with mild, severe and fatal respiratory diseases. Panels A, B, C, D and E represent the phylogenetic trees of the hexon, penton and fiber genes, the L1 52/55 kDa DNA packaging protein, and the whole genomes of HAdV-7, respectively. The neighbor-joining trees with 1,000 replicates were constructed using the MEGA 6.06 software with default parameters and a maximum-composite-likelihood method. The HAdV-7d isolates with whole genomes sequenced in this study are highlighted with triangles, with the strains from fatal cases indicated by filled triangles. The archived HAdV-7 genome sequences and hexon, penton, fiber, and 52/55 kDa sequences, with GenBank accession numbers, country of isolation, strain name, year of isolation (if available), and genome type (if available), were included in this phylogenetic analysis.
Scientific Reports | 6:37216 | DOI: 10.1038/srep37216
In conclusion, this study identified the circulating HAdV types associated with pediatric ARD in Southern China, along with their clinical features and whole-genome sequencing analysis. In Southern China, the predominant types in children were HAdV-3 and HAdV-7. The peak season of HAdV infection in 2011-2012 was summer. The re-emergent HAdV-7d isolates originated from the earlier strain associated with a fatal ARD outbreak (2009). HAdV-7d caused more severe illness and a higher fatality rate in pediatric inpatients than other types. The L1 52/55 kDa DNA packaging protein may be associated with the HAdV-7 hypervirulent phenotype. Fever, cough, and sore throat were the three most common respiratory symptoms of severe HAdV infections. A vaccine against HAdV-7 is urgently needed in China and in other population-dense countries. HAdV-7d, which is associated with higher severity of illness, has re-emerged in China and may have the potential to cause more outbreaks. Clinicians should pay particular attention to HAdVs in children with ARD. We urge extensive and continuous epidemiological surveillance and genetic characterization of the currently circulating HAdV strains in China, as well as in other countries with close-quartered and/or vulnerable populations.

Methods
Study design and clinical specimens. This study was carried out in the pediatric department of Zhujiang Hospital, Southern Medical University. Pediatric outpatients and inpatients under 14 years of age with influenza-like symptoms and clinically confirmed ARD from 2010 to 2012 were included. Nasopharyngeal swab specimens were collected from these patients. All experimental protocols were approved by the Medical Ethics Board of Southern Medical University and were carried out in accordance with the approved guidelines. Informed consent for participation was obtained from the guardians of all underage participants. Data records of the samples, sample collection and analysis were de-identified and completely anonymous.

Microbiological diagnostic tests. Human adenovirus (HAdV), influenza virus A and B (Flu A & B), human parainfluenza virus (HPIV) types 1-4, respiratory syncytial virus (RSV), respiratory enteroviruses and rhinoviruses (RHV), human metapneumovirus (hMPV), human coronaviruses (HCoV) 229E, OC43, NL63 and HKU1, and human bocavirus (HBoV) were detected in the specimens by real-time fluorescent PCR according to the methods described earlier 53-56 .
Virus culture, identification, and molecular typing of HAdVs. The nasopharyngeal swab specimens from HAdV-positive patients were inoculated into HEp-2 cells, and cytopathic effect (CPE) was monitored for ten days. The HAdVs were molecularly typed by PCR amplification and sequencing of all seven hypervariable regions in the hexon gene 57 . Molecular types were determined by BLASTN of the assembled contigs against existing GenBank sequences.
HAdV-7 whole-genome sequencing, annotation and phylogenetic analysis. Purified HAdV genomic DNA was submitted to Sagene Bio Co. (Guangzhou, China) for next-generation whole-genome sequencing on the Ion Torrent Personal Genome Machine (PGM™). In brief, genomic libraries were prepared using the Ion Xpress plus fragment library kit (Life Technologies), and barcoded adapters were linked to the DNA fragments, which were then amplified by emulsion PCR with the Ion PGM 200 Xpress Template kit (Life Technologies). After breaking the emulsion and adding the sequencing primer and polymerase, the fully prepared Ion Sphere Particle (ISP) beads of the seven samples were loaded into Ion 318 v2 chips and individually sequenced on the Ion Torrent PGM, with 500 flows generating 6,063,042 reads with an average length of 182 bp and a mean coverage of 2,409x. All the sequencing ladders were assembled with the Program to Assemble Spliced Alignments (PASA) tool 58 and SEQMAN software from the Lasergene package (DNAStar; Madison, WI). HAdV-7 strain 0901HZ/ShX/CHN/2009 (35,239 bp) was chosen as the reference genome.
The PASA assembly command line was as follows: "%../scripts/Launch_PASA_pipeline.pl -c alignAssembly.config -C -R -g genome_sample.fasta -t all_sample.fasta.clean -T -u all_sample.fasta -f FL_acs.txt -ALIGNER blat gmap -CPU 2". The configuration file was given after "-c"; "genome_sample.fasta" was the reference genome file. Because we used PASA to assemble virus genomes against a reference genome, the "-t" and "-u" parameters pointed to the same FASTA file of virus sequencing data, despite the different file names. The "-ALIGNER" parameter specified two alignment methods, blat and gmap. The other parameters were left at their defaults. Two CPUs were used to run the program in parallel. Gaps and ambiguous sequences were PCR-amplified using different primers and re-sequenced by the Sanger method for more than three-fold coverage. Both the 5′ and 3′ ends, including the inverted terminal repeats, were sequenced directly using genomic DNA as template, as described earlier 1 . The genome sequences were annotated based on the previous annotation of the HAdV-7 prototype strain (Gomen; 1952) 59 and deposited in the GenBank database. The Molecular Evolutionary Genetics Analysis (MEGA) version 6.06 software (http://www.megasoftware.net/megamac.php) was used for phylogenetic analyses of the HAdV-7 hexon, fiber and penton genes, the L1 52/55 kDa DNA packaging protein gene and the whole genomes, with additional sequences retrieved from the GenBank database. Neighbor-joining phylogenetic trees with 1,000 bootstrap replicates were constructed using a maximum-composite-likelihood method with default parameters.
Genome type identification by in silico and agarose gel electrophoresis restriction endonuclease analysis (REA). Vector NTI Advance 11.5 (Invitrogen Corp.; San Diego, CA, USA) was used for the in silico REA.
Statistical analysis. Statistical analyses were performed with the SPSS program. Descriptive statistics were used for all variables; continuous variables were summarized as medians and ranges, and categorical variables as frequencies and proportions. Student's t-test, the Chi-square test or the Fisher exact test was used, as appropriate, to determine differences between groups.
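For small groups, such as a comparison of outcomes between one HAdV type and all others, the Fisher exact test is the usual choice among the tests named above. The following is a minimal sketch of a two-sided Fisher exact test on a 2 × 2 table, written with the standard library only; it is not the SPSS procedure used in the study. The counts in the example are hypothetical placeholders chosen for illustration and are not the study's data.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher exact p value for the 2 x 2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2
    denominator = comb(total, col1)

    def table_probability(x: int) -> float:
        # Probability that the top-left cell equals x given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / denominator

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    observed = table_probability(a)
    # Small relative tolerance guards against floating-point ties.
    p_value = sum(
        p
        for p in (table_probability(x) for x in range(lowest, highest + 1))
        if p <= observed * (1 + 1e-9)
    )
    return min(1.0, p_value)


# Hypothetical counts (NOT study data): rows are groups, columns are died / survived.
p_example = fisher_exact_two_sided(4, 10, 1, 53)
print(f"two-sided Fisher exact p = {p_example:.4f}")
```

Summing all tables that are no more probable than the observed one is one common definition of the two-sided p value; statistical packages may offer other definitions, so small differences between tools are possible.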