Since its first identification in Scotland, over 1,000 cases of unexplained hepatitis in children have been reported worldwide, including 278 cases in the UK1. Here we report an investigation of 38 cases, 66 age-matched immunocompetent controls and 21 immunocompromised comparator participants, using a combination of genomic, transcriptomic, proteomic and immunohistochemical methods. We detected high levels of adeno-associated virus 2 (AAV2) DNA in the liver, blood, plasma or stool from 27 of 28 cases. We found low levels of adenovirus (HAdV) and human herpesvirus 6B (HHV-6B) in 23 of 31 and 16 of 23, respectively, of the cases tested. By contrast, AAV2 was infrequently detected and at low titre in the blood or the liver from control children with HAdV, even when profoundly immunosuppressed. AAV2, HAdV and HHV-6 phylogeny excluded the emergence of novel strains in cases. Histological analyses of explanted livers showed enrichment for T cells and B lineage cells. Proteomic comparison of liver tissue from cases and healthy controls identified increased expression of HLA class II, immunoglobulin variable regions and complement proteins. HAdV and AAV2 proteins were not detected in the livers. Instead, we identified AAV2 DNA complexes reflecting both HAdV-mediated and HHV-6B-mediated replication. We hypothesize that high levels of abnormal AAV2 replication products aided by HAdV and, in severe cases, HHV-6B may have triggered immune-mediated hepatic disease in genetically and immunologically predisposed children.
In March 2022, the report of five cases of severe hepatitis of unknown aetiology led to the UK Health Security Agency (UKHSA) identifying 278 cases in total as of 30 September 2022 (ref. 1). Cases, defined as acute non-A–E hepatitis with serum transaminases of more than 500 IU in children under 10 years of age, were found to have been occurring since January 2022 (ref. 2). In the UK, 196 cases required hospitalization, 69 were admitted to intensive care and 13 required liver transplantation1. Case numbers have declined since April 2022 (ref. 3).
UKHSA investigations identified HAdV to be commonly associated with the unexplained paediatric hepatitis, with 64.7% (156 of 241) testing positive in one or more samples from whole blood (the most sensitive sample type4) or mucosal swabs. HAdVs from the blood of 35 of 77 patients were typed as F41. Seven of eight patients in England who required liver transplantation tested HAdV positive in blood samples, with F41 found in five of five genotyped2. SARS-CoV-2 infection was detected in 8.9% (15 of 169) of UK and 12.8% (16 of 125) of English cases2.
Given the uncertainty around the aetiology of this outbreak, and the potential that HAdV-F41, if implicated (Fig. 1a), could be a new or recombinant variant, we undertook untargeted metagenomic and metatranscriptomic sequencing of liver biopsies from five liver transplant cases and whole blood from five non-transplanted cases (Table 1 and Fig. 1b). The results were further verified by confirmatory PCRs of liver, blood, stool and nasopharyngeal samples from a total of 38 cases for which there was sufficient residual material. We compared our results with those from 13 healthy children and 52 previously healthy children presenting to hospital with other febrile illness, including HAdV, hepatitis unrelated to the current outbreak or a critical illness requiring admission to the intensive care unit. We also tested blood and liver biopsies from 17 profoundly immunosuppressed children with hepatitis who were not part of the current outbreak, in whom reactivation of latent infections might be expected.
We received samples from 38 children meeting the case definition (Table 1). All cases were less than 10 years of age and 22 of 23 cases previously tested were positive by HAdV PCR (Table 2, Extended Data Table 1 and Supplementary Table 1). The samples received from these cases and the investigations carried out on them are summarized in Fig. 1b,c.
Pre-existing conditions, autoimmune, toxic and other infectious causes of hepatitis were excluded in 12 transplanted (cases 1–5, 28, 29, 31–34 and 36) and four non-transplanted (cases 30, 35, 37 and 38) children, investigated at two liver transplant units (Supplementary Table 1). The 12 transplanted cases reported gastrointestinal symptoms (nausea, vomiting and diarrhoea) preceding transplant by a median of 20 days (range 8–42 days). All 12 transplanted children survived, whereas the four children who did not receive liver transplants recovered without sequelae or evidence of chronic liver-related conditions. Five of the remaining 22 cases referred by Health Security Agencies, for whom this information was available, recovered without sequelae (Table 1 and Supplementary Table 1).
We performed metagenomic and metatranscriptomic sequencing on samples of frozen explanted liver tissue from five cases who received liver transplants (median age of 3 years) and six blood samples from five non-transplanted hepatitis cases (median age of 5 years) (Table 1 and Fig. 1b). The liver samples had uniform and consistently high sequencing depth both for DNA sequencing (DNA-seq) and RNA-seq, whereas the blood samples had variable sequencing depth particularly for RNA-seq (Supplementary Table 2). We detected5 abundant AAV2 reads in DNA-seq from five of five explanted livers and four of five blood samples from non-transplant cases (7–42 and 1.2–42 reads per million, respectively) (Table 2). Lower levels of HHV-6B were present in DNA-seq of all explanted liver samples (0.09–4 reads per million) but not in the six blood samples (Table 2). HAdV was detected (five reads) in one blood sample (Table 2).
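The read abundances above are expressed as reads per million. As a minimal sketch of that normalization, assuming the denominator is the total number of reads in the library (the text does not state whether this is counted before or after quality filtering), the calculation is a simple ratio. The function name and the example counts are hypothetical and are not taken from the study data.

```python
def reads_per_million(target_reads: int, total_reads: int) -> float:
    """Normalize a raw read count to reads per million (RPM) sequenced reads."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return target_reads / total_reads * 1_000_000


# Hypothetical library: 840 virus-classified reads among 20 million total reads.
example_rpm = reads_per_million(840, 20_000_000)
print(f"{example_rpm:.1f} reads per million")
```

Normalizing in this way allows libraries of different sequencing depth, such as the liver and blood samples described here, to be compared on a common scale.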
Evidence of AAV2 replication
Metatranscriptomics revealed AAV2, but not HHV-6B or HAdV, RNA reads, in liver and blood samples (0.7–10 and 0–7.8 reads per million, respectively). Mapping liver RNA-seq data to the RefSeq AAV2 genome (NC_001401.2) identified high expression of the Cap open reading frame, particularly at the 3′ end of the capsid, suggesting viral replication6 (Extended Data Fig. 1a), whereas reverse transcription (RT)–PCR of two livers confirmed the presence of AAV2 mRNA from the Cap open reading frame (Extended Data Fig. 1c). In the blood samples, which had not been treated to preserve RNA, we detected low levels of AAV2 RNA reads mapping throughout the genome (Extended Data Fig. 1b).
Nanopore sequencing of explanted livers
Ligation-based untargeted nanopore sequencing was applied to DNA from four of five frozen liver samples. All four samples were initially sequenced at a lower depth (average N50 of 8.37 kb). Six to sixteen AAV2 reads were obtained from each sample (5.57–22.24 million total reads; Supplementary Table 3). Mapping revealed concatenation of the 4-kb genome, compatible with active AAV2 replication7. We observed alternating and head-to-tail concatemers, which could be consistent with both HAdV and human herpesvirus-mediated rolling hairpin and rolling circle replication, respectively8. Two of these samples were sequenced more deeply, resulting in 52 and 178 AAV2 reads in 82.9 and 122 million total (N50 of 4.40–8.52 kb) (Supplementary Table 3). Of the reads in the more deeply sequenced datasets, 42–48% comprised randomly linked, truncated and rearranged genomes, with few that were intact and of full length (Extended Data Fig. 2). The remaining reads were less than 3,000 bp long and may represent sections either of monomeric genomes or of more complex structures.
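The N50 values quoted above summarize the read-length distribution of each nanopore run: the N50 is the length such that reads of that length or longer contain at least half of all sequenced bases. The following is a minimal sketch of that standard definition; the read lengths are a hypothetical toy example, not data from this study.

```python
def n50(read_lengths):
    """Return the N50: the length L such that reads of length >= L
    together contain at least half of all sequenced bases."""
    if not read_lengths:
        raise ValueError("read_lengths must not be empty")
    half_total = sum(read_lengths) / 2
    running = 0
    # Walk from the longest read downwards until half the bases are covered.
    for length in sorted(read_lengths, reverse=True):
        running += length
        if running >= half_total:
            return length


# Hypothetical read lengths in bases (40,000 bases in total).
toy_lengths = [12_000, 9_000, 8_000, 5_000, 3_000, 2_000, 1_000]
toy_n50 = n50(toy_lengths)
print(toy_n50)
```

In this toy example the two longest reads already cover more than half of the bases, so the N50 is the length of the second-longest read.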
There was some evidence of AAV2 integration by deeper nanopore sequencing of explanted livers (Supplementary Table 3); however, none of the integration sites was confirmed by Illumina metagenomic or targeted AAV2 sequencing. The results are likely to represent artefacts of this library preparation method; chimeric reads have been described to occur in 1.7–3% of reads9,10. Given the number of human reads (72–120 million), we might expect to see this artefact occurring more commonly between AAV2 and human reads than between AAV2 reads.
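The reasoning about chimeric reads can be made concrete with a back-of-envelope calculation. This sketch uses the figures given in the text (178 AAV2 reads in 122 million total reads for the more deeply sequenced library, read in parallel order from the text, and a chimera rate of 1.7–3%). It additionally assumes, purely for illustration, that a chimeric read joins an AAV2 fragment to a partner drawn at random in proportion to its abundance in the library; that model and all names in the code are hypothetical simplifications, not part of the study.

```python
# Values taken from the text: 178 AAV2 reads in 122 million total reads,
# and a reported chimera rate of 1.7-3% for this library preparation method.
AAV2_READS = 178
TOTAL_READS = 122_000_000
CHIMERA_RATES = (0.017, 0.03)


def expected_chimeras(rate, aav2_reads=AAV2_READS, total_reads=TOTAL_READS):
    """Return (chimeric AAV2 reads, partner is non-AAV2, partner is AAV2).

    Illustrative assumption: the partner fragment is drawn at random in
    proportion to its share of the library.
    """
    chimeric = aav2_reads * rate
    partner_aav2 = chimeric * (aav2_reads / total_reads)
    partner_other = chimeric - partner_aav2
    return chimeric, partner_other, partner_aav2


for rate in CHIMERA_RATES:
    chimeric, other, aav2 = expected_chimeras(rate)
    print(f"rate {rate:.1%}: {chimeric:.1f} chimeric AAV2 reads, "
          f"{other:.1f} with a non-AAV2 partner, {aav2:.1e} with an AAV2 partner")
```

Under these assumptions a handful of AAV2-containing chimeric reads would be expected per library, essentially all of them joined to non-AAV2 (overwhelmingly human) sequence, which is the pattern that would mimic integration.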
Confirmatory real-time PCR
Where sufficient residual material was available, PCR tests were performed for AAV2 (28 of 38 cases), HAdV (31 of 38 cases) and HHV-6B (23 of 38 cases). The results confirmed high levels (cycle threshold (Ct) values: 17–21) of AAV2 DNA in all five frozen explanted livers that had undergone metagenomics (Table 2 and Fig. 2d), and lower levels of HHV-6B and HAdV DNA (Ct values: 27–32 and 37–42, respectively). AAV2 DNA was also detected (Ct values: 19–25) in blood samples from four of five cases that had undergone metagenomics, whereas HAdV, at levels too low to genotype, and HHV-6B were detected in two of four and three of four cases, respectively (one case had insufficient material) (Table 2). One of the blood metagenomics cases (case 9, JBB1) with insufficient material to test for HAdV and HHV-6B, tested positive for both viruses in the referring laboratory. The AAV2-negative blood sample (case 10, JBB15) was also negative for HAdV but positive for HHV-6B (Table 2). A further ten of ten blood samples tested from cases were positive for HAdV by PCR. Sufficient material was available for AAV2 PCR in six of these (all positive; Ct values: 20–23) and HHV-6B PCR in two (one positive Ct value: 37) (Extended Data Table 1).
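To give a sense of scale for the Ct values above, a difference in Ct can be converted into an approximate fold difference in starting template. The sketch below assumes perfect doubling per cycle (100% amplification efficiency) and comparable assay thresholds, neither of which is stated in the text and neither of which necessarily holds across different viral targets, so the output is an order-of-magnitude guide only. The function name is hypothetical, and the midpoints of the reported Ct ranges are used purely for illustration.

```python
def fold_difference(ct_a: float, ct_b: float, efficiency: float = 1.0) -> float:
    """Approximate how many times more template assay A started with than
    assay B, given their Ct values. efficiency=1.0 means perfect doubling."""
    return (1.0 + efficiency) ** (ct_b - ct_a)


# Midpoints of the Ct ranges reported for the five frozen explanted livers.
aav2_ct = (17 + 21) / 2
hhv6b_ct = (27 + 32) / 2
hadv_ct = (37 + 42) / 2

print(f"AAV2 vs HHV-6B: ~{fold_difference(aav2_ct, hhv6b_ct):.0f}-fold")
print(f"AAV2 vs HAdV:   ~{fold_difference(aav2_ct, hadv_ct):.0f}-fold")
```

On these assumptions, a gap of about 10 cycles corresponds to roughly a thousand-fold difference and a gap of about 20 cycles to roughly a million-fold difference.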
AAV2 PCR was positive in nine formalin-fixed paraffin-embedded (FFPE) liver samples, including seven from transplanted cases (Ct values: 23–25) and two from non-transplanted cases (Ct values: 34–36; Extended Data Table 1). HHV-6B PCR was positive in six of seven FFPE samples (not case 32) from transplanted (Ct values: 30–37) and zero of two from non-transplanted (cases 30 and 35) cases, with positive HAdV (Ct values: 40–44) in four of nine cases. Three transplanted (cases 32, 34 and 36) and three non-transplanted (cases 35, 37 and 38) cases had serum available for testing. All were AAV2 positive (Ct values: 27–32) and HHV-6B negative, with one transplanted case and one non-transplanted case testing HAdV positive (Extended Data Table 1).
Together, 27 of 28 cases tested were AAV2 PCR positive, 23 of 31 were HAdV positive and 16 of 23 were HHV-6B positive. When results from referring laboratories were included, 33 of 38 cases were positive for HAdV and 19 of 26 cases were positive for HHV-6B (Table 2 and Extended Data Table 1).
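Because each virus was tested in a different number of cases, the counts above are easier to compare as percentages. The short sketch below only restates the numbers given in the text; the variable and function names are hypothetical.

```python
# (positive, tested) pairs exactly as reported in the text.
results = {
    "AAV2": (27, 28),
    "HAdV": (23, 31),
    "HHV-6B": (16, 23),
    "HAdV incl. referring laboratories": (33, 38),
    "HHV-6B incl. referring laboratories": (19, 26),
}


def percent_positive(positive: int, tested: int) -> float:
    """Percentage of tested cases that were PCR positive."""
    return 100.0 * positive / tested


summary = {name: round(percent_positive(p, t), 1) for name, (p, t) in results.items()}
for name, pct in summary.items():
    print(f"{name}: {pct}%")
```

The denominators differ because testing depended on the availability of residual material, so the percentages describe the tested subsets rather than all 38 cases.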
Controls and comparators
To better contextualize the findings in cases with unexplained hepatitis, we selected control groups of children who were not part of the outbreak.
Blood from immunocompetent children
Whole blood from 65 immunocompetent children matched by age to cases (median age of 3.8 years) (Fig. 1b, Extended Data Table 2a and Supplementary Table 4) who were healthy, or had HAdV infection, hepatitis or critical illness, including requiring critical care, were selected from the PERFORM (personalised risk assessment in febrile illness to optimise real-life management; www.perform2020.org) and DIAMONDS (diagnosis and management of febrile illness using RNA personalised molecular signature diagnosis study; www.diamonds2020.eu) studies. Both studies recruited children presenting to hospital with an acute-onset febrile illness between 2017 and 2020 (PERFORM) and July 2020 to October 2021, during the COVID-19 pandemic (DIAMONDS) (Supplementary Table 4). Of the PERFORM–DIAMONDS control whole-blood samples, 6 of 65 (9.2%) were AAV2 PCR positive (Supplementary Table 5), compared with 10 of 11 (91%) whole-blood samples from cases (Fig. 2a; P = 8.466 × 10⁻⁸, Fisher’s exact test). AAV2 DNA levels were significantly higher in whole-blood samples from cases than from controls (Fig. 2e; P = 2.747 × 10⁻¹¹, Mann–Whitney test).
One participant with an HAdV-F41-positive blood sample, originally thought to have unexplained paediatric hepatitis, was later found to have a previous condition that explained the hepatitis and was therefore reclassified as a control (referred to as ‘reclassified control’ or CONB40) (Supplementary Table 5). This blood sample was negative for AAV2 by PCR (Supplementary Table 5).
Liver from immunocompromised children
Frozen liver biopsy material from four immunocompromised children (median age of 10 years) (CONL1–4) who had been investigated for other forms of hepatitis was also tested (Fig. 1b and Extended Data Table 2b). In three children, liver enzyme levels were raised (Supplementary Table 6); no results were available for CONL4. AAV2 was detected in CONL3 (Ct value: 39) and HHV-6B was detected in CONL2 (Ct value: 34), whereas HAdV was negative (Fig. 2d and Supplementary Table 5).
Blood from immunocompromised comparators
We also tested immunocompromised children who are more likely to reactivate latent viruses. Whole-blood samples from 17 immunocompromised children (median age of 1 year) with raised levels of liver transaminases (AST/ALT of more than 500 IU) and viraemia (HAdV or cytomegalovirus), all sampled in 2022 (Fig. 1b), were tested for AAV2, HHV-6B and HAdV (Extended Data Table 2b and Supplementary Table 5). The majority had received human stem cell or solid organ transplants, and none was linked to the recent hepatitis outbreak (Extended Data Table 2b). Five of 15 (33%) whole-blood samples were positive for HHV-6B, whereas 6 of 17 (35%) were positive for AAV2, significantly fewer than in cases (P = 0.005957, Fisher’s exact test) and at significantly lower levels (P = 6.517 × 10⁻⁵, Mann–Whitney test) (Fig. 2 and Supplementary Table 5). One HAdV-positive and AAV2-positive immunocompromised comparator (CONB23) was also positive for HHV-6B (Supplementary Table 5).
Four of the six AAV2-positive children from the PERFORM–DIAMONDS cohort (Fig. 2a and Supplementary Table 5) and all six of the AAV2-positive immunocompromised children (Fig. 2a and Supplementary Table 5) were also HAdV positive.
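The two Fisher's exact tests reported in this section compare the proportion of AAV2-positive whole-blood samples in cases (10 of 11) with that in immunocompetent controls (6 of 65) and in immunocompromised comparators (6 of 17). As a minimal sketch, assuming the tests were two-sided and used the conventional rule of summing all tables no more probable than the observed one (the text does not specify the software or the sidedness), the P values can be recomputed from the counts alone. The function below is a hypothetical standard-library implementation, not the authors' code.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2
    lowest, highest = max(0, col1 - row2), min(row1, col1)

    def weight(k: int) -> int:
        # Unnormalized probability, kept as an exact integer.
        return comb(row1, k) * comb(row2, col1 - k)

    observed = weight(a)
    extreme = sum(weight(k) for k in range(lowest, highest + 1)
                  if weight(k) <= observed)
    return extreme / comb(n, col1)


# Rows: group; columns: AAV2 positive, AAV2 negative (counts from the text).
p_immunocompetent = fisher_exact_two_sided(10, 1, 6, 59)
p_immunocompromised = fisher_exact_two_sided(10, 1, 6, 11)
print(f"cases vs immunocompetent controls:    P = {p_immunocompetent:.3e}")
print(f"cases vs immunocompromised comparators: P = {p_immunocompromised:.4f}")
```

Working with exact integer weights avoids floating-point ties when deciding which tables are at least as extreme as the observed one. Under the stated assumptions, the recomputed values are consistent with the P values reported above.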
Viral whole-genome sequencing
One full HAdV-F41 genome sequence from the stool of one case (OP174926, case 22) (Supplementary Table 7) clustered phylogenetically with the HAdV-F41 sequence obtained from the reclassified control (CONB40) and with other HAdV-F41 sequences collected between 2015 and 2022, including 23 contemporaneous stool samples from children without the unexplained paediatric hepatitis (Figs. 1c and 3a). Sequencing and k-mer analysis11 of HAdV from 13 cases with partial sequences identified the genotype HAdV-F41 in 12 cases (Supplementary Tables 7 and 8). The partial sequences showed most similarity to the control sequence OP047699 (Supplementary Table 8), mapping across the entire viral genome, thus further excluding a recombinant virus.
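The idea behind k-mer-based genotyping of partial sequences can be illustrated with a simple containment score: the fraction of k-mers in a partial query sequence that also occur in each reference genome. This sketch is not the tool cited in the text (ref. 11); the function names, the choice of k and the toy sequences are all hypothetical, and real genotyping would use full reference genomes and a larger k.

```python
def kmers(sequence: str, k: int) -> set:
    """Return the set of all overlapping k-mers in a sequence."""
    seq = sequence.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def containment(query: str, reference: str, k: int) -> float:
    """Fraction of the query's k-mers that are present in the reference."""
    query_kmers = kmers(query, k)
    if not query_kmers:
        return 0.0
    return len(query_kmers & kmers(reference, k)) / len(query_kmers)


def best_match(query: str, references: dict, k: int = 5):
    """Score the query against every reference and return (best name, scores)."""
    scores = {name: containment(query, ref, k) for name, ref in references.items()}
    return max(scores, key=scores.get), scores


# Hypothetical toy references and a partial query taken from within type_A.
toy_references = {
    "type_A": "ACGTACGTTAGCCGATAGCTAGGCTA",
    "type_B": "TTGACCGGTTAACCGGATATCGCGAT",
}
toy_query = "CGTTAGCCGATAGC"
best, scores = best_match(toy_query, toy_references)
print(best, scores)
```

Containment, rather than a symmetric similarity such as the Jaccard index, is used here because a partial sequence covers only a fraction of the reference and would otherwise score poorly even against its true genotype.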
Single-nucleotide polymorphisms were largely shared between the single HAdV-positive stool from a case (OP174926) and control whole-genome sequences (Extended Data Fig. 3a). Given reported mutation rates for HAdV-F41 and other adenoviruses12,13, any differences are likely to have arisen before the outbreak. No new or unique amino acid substitutions were noted in HAdV sequences from cases with only two substitutions overall (Extended Data Fig. 2d) and none in proteins critical for AAV2 replication.
AAV2 sequences from 15 cases, including five from the explanted livers and ten from whole blood from non-transplanted cases, clustered phylogenetically with control AAV2 sequences obtained from four immunocompromised HAdV-positive children with elevated levels of ALT in the comparator group (Extended Data Table 2b) and two healthy children with recent HAdV-F41 diarrhoea (Fig. 3b and Supplementary Table 9). The degree of diversity and lack of a unique common ancestor between case AAV2 genomes suggest that these are not specific to the hepatitis outbreak, but instead reflect the current viral diversity of the general population. Although comparison of the AAV2 sequences showed no difference between cases and controls, contemporary AAV2 sequences showed changes in the capsid compared with historic AAV2 (Extended Data Fig. 3c). None of these changes was shared with the hepatotropic AAV7 and AAV8 viruses (Extended Data Fig. 3b). The majority of the contemporary AAV2 genomes in cases and controls (20 of 21) contained a stop codon in the X gene, which is involved in viral replication14, whereas historic AAV2 genomes contained this less frequently (11 of 35). The significance, if any, of this is currently unknown.
Although mean read depths for four HHV-6B genomes recovered from explanted livers were low (×5–10) (Supplementary Table 12), phylogeny (Fig. 3c) confirmed that all were different.
Transduction of AAV2 capsid mutants
Using a recombinant AAV2 (rAAV2) vector with a VP1 sequence (Extended Data Fig. 4a) containing the consensus amino acid sequence from AAV2 cases (AAV2Hepcase) (Extended Data Fig. 3b), we generated functional rAAV particles that transduced Huh-7 cells with comparable efficacy to both canonical AAV2 and the synthetic liver-tropic LK03 AAV vector15. Unlike canonical AAV2, the AAV2Hepcase capsid, which contains mutations (R585S and R588T) that potentially affect the heparan sulfate proteoglycan (HSPG)-binding domain, was unaffected by heparin competition, a feature that is associated with increased hepatotropism16,17 (Extended Data Fig. 4b,c).
Histology and immunohistochemistry
Histological examination of the 12 liver explants and two liver biopsies showed nonspecific features of acute hepatitis with ballooning hepatocytes, disrupted liver architecture with varying degrees of perivenular, bridging or pan-acinar necrosis. There was no evidence of fibrosis suggestive of an underlying chronic liver disease. The appearances were similar to historic cases of seronegative hepatitis of unknown cause in children. There were no typical histological features of autoimmune hepatitis, notably no evidence of portal-based plasma cell-rich infiltrates. A cellular infiltrate was present in all cases, which on staining appeared to be predominantly of CD8+ T cells but also included CD20+ B cells. More widespread staining with the CD79a pan-B cell lineage, which also identifies plasma cells, was also observed (Extended Data Fig. 5). Macrophage lineage cells showed some C4d complement staining, whereas staining for immunoglobulins was nonspecific with disruption of the normal canalicular staining seen in controls due to the architectural collapse. MHC class I and class II staining, although increased in cases, was nonspecific and associated with sinusoid-containing blood cells and necrotic tissue (Extended Data Fig. 6a). No viral inclusions were observed and there were no features suggestive of direct viral cytopathic effect.
Immunohistochemistry was negative for adenovirus. Staining of the five explanted livers with AAV2 antibodies demonstrated evidence of nonspecific ingested debris but not the nuclear staining seen in the positive AAV2-infected cell lines and infected mouse tissue (Extended Data Fig. 6b). All five liver explants showed positive staining of macrophage-derived cells with antibody to HHV-6B, with no staining of negative control serial sections (Extended Data Fig. 6b). No specific HHV-6B staining was observed in 13 control liver biopsies from patients (including three children less than 18 years of age) with other viral hepatitis, toxic liver necrosis, autoimmune and other hepatitis, and normal liver. The control set was also negative for HAdV and AAV2 by immunohistochemistry.
Liver sections were morphologically suboptimal for electron microscopy, but no viral particles were identified in hepatocytes, blood vessel endothelial cells and Kupffer cells.
We quantified functional cytokine activity by expression of independently derived cytokine-inducible transcriptional signatures of cell-mediated immunity (Supplementary Table 11) in bulk genome-wide transcriptional profiles from four of the frozen explanted livers. Results were compared with published data from normal adult livers (n = 10) and adult hepatitis B-associated acute liver failure (n = 17) (GSE96851)18. Data from the unexplained hepatitis cases revealed increased expression of diverse cytokines and pathways compared with normal liver. These pathways included prototypic cytokines associated with T cell responses, including IFNγ, IL-2, CD40LG, IL-4, IL-5, IL-7, IL-13 and IL-15 (Fig. 4a and Supplementary Table 12), as well as some evidence of innate immune type I interferon responses. Many of these responses showed substantially greater activity in unexplained hepatitis than in fulminant hepatitis B virus disease. The most striking enrichment was for TNF expression, and included other canonical pro-inflammatory cytokines including IL-1 and IL-6 (Extended Data Fig. 7). These data are consistent with an inflammatory process involving multiple pathways.
Proteomic analysis of the five frozen explanted livers did not detect AAV2 or HAdV proteins. Expression of HHV-6B U4, a protein of unknown function, was found in four of five cases; U43, part of the helicase primase complex, was found in two of five cases; and U84, a homologue of cytomegalovirus UL117, implicated in HHV-6B nuclear replication, was found in two of five cases (Extended Data Fig. 8).
The human proteome from the five frozen liver explants was compared with publicly available data from seven control ‘normal’ livers, taken from two different studies19,20. Both protein and peptide analyses (Fig. 4b,c and Supplementary Tables 13 and 14) found increased expression in unexplained hepatitis cases of HLA class II proteins and peptides (for example, HLADRB1 and HLADRB4), multiple peptides from variable regions of the heavy and light chains of immunoglobulin, complement proteins (such as C1q) and intracellular and extracellular released proteins from neutrophils and macrophages (MMP8 and MPO).
There was no evidence of HAdV, AAV2 or HHV-6B in any of the control livers.
Despite reports implicating HAdV-F41 as causing the recent outbreak of unexplained paediatric hepatitis, we found very low levels of HAdV DNA, no proteins, inclusions or viral particles, including in explanted liver tissue from affected cases, and no evidence of a change in the virus. By contrast, metagenomic and PCR analysis of liver tissue and blood identified high levels of DNA from AAV2, a member of the Dependoparvovirus genus, which has not been previously associated with clinical disease, in 27 of 28 cases. Replication of AAV2 requires co-infection with a helper virus, such as HAdV, herpesviruses or papillomavirus21, and can also be triggered in the laboratory by cellular damage22, raising the possibility that the AAV2 detected was a bystander of previous HAdV-F41 infection and/or liver damage. Against this, we found little or no AAV2 in blood from age-matched, immunocompetent children including those with HAdV infection, hepatitis or critical illness (Fig. 2d). AAV2 has been reported to establish latency in the liver23; however, even in critically ill immunosuppressed children with hepatitis in whom reactivation might occur, we detected AAV2 infrequently and at significantly lower levels in the blood or in liver biopsies (Fig. 2d,g).
RNA transcriptomic and real-time PCR data from explanted livers point to active AAV2 infection, although we did not detect AAV2 proteins by immunohistochemistry (Extended Data Fig. 6b) or proteomics (Extended Data Fig. 8) or any viral particles. The abundant AAV2 genomes in the explanted liver are concatenated with many complex and abnormal configurations. AAV genome concatenation may occur during AAV2 replication8, whereas abnormal AAV2 DNA complexes and rearrangements have been observed in the liver following AAV gene therapy7. Hepatitis following AAV gene therapy has been well described24,25,26, with deaths occurring, albeit rarely27. The pattern of complexes typifies both HAdV and herpesvirus (including HHV-6B)-mediated AAV2 DNA replication6. The presence of HHV-6B DNA in 11 of 12 explanted livers, but not in livers (0 of 2) of non-transplanted children, or control livers as well as the expression, in 5 of 5 cases tested, of HHV-6B proteins, including U43, a homologue of the HSV1 helicase primase UL52, which is known to aid AAV2 replication, highlight a possible role for HHV-6B as well as HAdV in the pathogenesis of AAV2 hepatitis, particularly in severe cases. Although AAV2 is also capable of chromosomal integration28,29,30, we found little evidence of this by long read sequencing, computational analysis of metagenomics data or examination of unmapped reads, although further confirmatory studies may be required.
Although the pathogenesis of unexplained paediatric hepatitis and the role of AAV2 remain to be determined, our results point strongly to an immune-mediated process. Transcriptomic and proteomic data from the five explant livers identified significant immune dysregulation involving genes and proteins that are strongly associated with activation of B cells and T cells, neutrophils and macrophages as well as innate pathways. The findings are supported by immunohistochemical staining showing infiltration into liver tissue of CD8+ T cells, B cells and B cell lineage cells. Upregulation of canonical pro-inflammatory cytokines including IL-15, which has also been seen in a mouse model of AAV hepatitis31, IL-4 and TNF occurred at levels greater even than are seen in fulminant liver failure following infection with hepatitis B virus. Increased levels in the same immunoglobulin variable region peptides and corresponding proteins from both immunoglobulin heavy and light chains across all five livers point to specific antibody involvement32. The finding of HLA-DRB1*04:01 in 12 of 13 cases tested (Supplementary Table 1) among children in our study supports the same genetic predisposition as mooted in a parallel study conducted in Scotland33.
An immune-mediated process is consistent with studies of hepatitis following AAV gene therapy, in which raised AAV2 IgG and capsid specific cytotoxic T lymphocytes are observed in the affected patients; however, whether these directly mediate hepatitis remains unclear26,34. Although we did not find that AAV2 sequences in cases differed from those in AAV2 occurring as co-infections in HAdV-F41-positive stool collected from control children during the contemporary HAdV-F41 gastroenteritis outbreak (Fig. 3b), rAAV capsid expressing a consensus capsid sequence from the unexplained hepatitis cases (AAV2Hepcase) showed reduced HSPG dependency, compared with canonical AAV2 (Extended Data Fig. 4), while retaining hepatocyte transduction ability. This points to likely greater in vivo hepatotropism of currently circulating AAV2 than has hitherto been assumed from data on canonical AAV2 (ref. 17). Another member of the parvovirus family, equine parvovirus-hepatitis, has also been associated with acute hepatitis in horses (Theiler’s disease)35.
There are several limitations to our study. Although other known infectious, autoimmune, toxic and metabolic aetiologies3 have been excluded including by other studies36,37, the number of cases investigated here is small, the study is retrospective, the immunocompromised controls were not perfectly age-matched, and only one immunocompetent and 17 immunocompromised controls were sampled during exactly the same period as the outbreak. Age-matched, immunocompetent controls contemporaneous with the outbreak from the DIAMONDS study, although few in number, were however found to be AAV2 negative in a separate study carried out in Scotland33.
Finally, our data are not sufficient on their own to rule out a contribution from SARS-CoV-2 Omicron, the appearance of which preceded the outbreak of unexplained hepatitis (Supplementary Table 1). We did not detect SARS-CoV-2 metagenomically even in three participants who tested positive on admission. Moreover, although seropositivity was higher in our cases (15 of 20) than in controls (3 of 10), this was not the case for another UK cohort36 (38%) or in preliminary data from a UKHSA case–control study3, which showed similar SARS-CoV-2 antibody prevalence between unexplained hepatitis cases and population controls (less than 5 years of age: 60.5% versus 46.3%, respectively; 5–10 years of age: 66.7% versus 69.6%, respectively). In line with UK national recommendations at the time, none of the children had received a COVID-19 vaccine.
Although we found little evidence for SARS-CoV-2 directly causing the hepatitis outbreak, we cannot exclude the effect of the COVID-19 pandemic on child mixing and infection patterns. The contemporaneous development of unexplained paediatric hepatitis with a national outbreak of HAdV-F41 (ref. 2) and the finding of HAdV-F41 in many cases suggest that the two are linked. Enteric HAdV infection is most common in those younger than 5 years of age2, and infection is influenced by mixing and hygiene38. Few cases of HAdV-F41 occurred between 2020 and 2022 and no major outbreaks were recorded2. The current HAdV outbreak followed relaxation of restrictions due to the pandemic and represented one of many infections, including other enteric pathogens that occurred in UK children following return to normal mixing39. Under normal circumstances, the levels of AAV2 antibodies are high at birth, subsequently declining to reach their lowest point at 7–11 months of age, increasing thereafter through childhood and adolescence40. AAV2 is known to spread with respiratory HAdVs, infections that declined during the COVID-19 pandemic, and has not been detected by us in over 30 SARS-CoV-2-positive nasopharyngeal aspirates (data not shown). We also found AAV2 DNA to be present in HAdV-F41-positive stool from both cases and controls (Supplementary Table 5). With loss of child mixing during the COVID-19 pandemic, reduced spread of common respiratory and enteric viral infections and no evidence of AAV2 in SARS-CoV-2-positive nasopharyngeal swabs, it is likely that immunity to both HAdV-F41 and AAV2 declined sharply in the age group affected by this unexplained hepatitis outbreak. Pre-existing antibody is known to reduce levels of AAV DNA in the liver of non-human primates following infusion of AAV gene therapy vectors41. 
The possibility that, in the absence of protective immunity, excessive replication of HAdV-F41 and AAV2 with accumulation of AAV2 DNA in the liver led to immune-mediated hepatic disease in genetically predisposed individuals needs further investigation. Evaluation of drugs that inhibit TNF and other cytokines massively elevated in this condition may identify important therapeutic options for future cases.
Metagenomic analysis and HAdV sequencing were carried out by the routine diagnostic service at Great Ormond Street Hospital (GOSH). Additional PCRs, immunohistochemistry and proteomics on samples received for metagenomics are part of the GOSH protocol for confirmation of new and unexpected pathogens. The use for research of anonymized laboratory request data, diagnostic results and residual material from any specimen received in the GOSH diagnostic laboratory, including all cases received from Birmingham Children’s Hospital, UKHSA, Public Health Wales and Public Health Scotland, as well as non-case samples from UKHSA, Public Health Scotland and GOSH research, was approved by UCL Partners Pathogen Biobank under ethical approval granted by the NRES Committee London-Fulham (REC reference: 17/LO/1530).
Children undergoing liver transplant were consented for additional research under the International Severe Acute Respiratory and Emerging Infection Consortium (ISARIC) WHO Clinical Characterisation Protocol UK (CCP-UK) (ISRCTN 66726260) (RQ3001-0591, RQ301-0594, RQ301-0596, RQ301-0597 and RQ301-0598). Ethical approval for the ISARIC CCP-UK study was given by the South Central–Oxford Research Ethics Committee in England (13/SC/0149), the Scotland A Research Ethics Committee (20/SS/0028) and the WHO Ethics Review Committee (RPC571 and RPC572).
The UKHSA has legal permission, provided by regulation 3 of The Health Service (Control of Patient Information) Regulations 2002, to process patient confidential information for national surveillance of communicable diseases and, as such, individual patient consent is not required.
Control participants from the EU Horizon 2020 research and innovation program DIAMONDS–PERFORM (grant agreement nos. 668303 and 848196) were recruited according to the approved enrolment procedures of each study, and with the informed consent of parents or guardians: DIAMONDS (London-Dulwich Research Ethics Committee: 20/HRA/1714) and PERFORM (London-Central Research Ethics Committee: 16/LO/1684).
The sample IDs for the cases and controls are anonymized IDs that cannot reveal the identity of the study participants and are not known to anyone outside the research group, such as the patients or the hospital staff.
Initial diagnostic testing by metagenomics and PCR was performed at GOSH Microbiology and Virology clinical laboratories. Further WGS and characterization were performed at UCL.
Birmingham Children’s Hospital provided us with explanted liver tissue from five biopsy sites from five cases, five 500 µl whole-blood samples from four cases and serum plasma from one case (Table 1 and Fig. 1b). These were used in metagenomics testing (Table 2), followed by HAdV, HHV-6 and AAV2 testing by PCR and, depending on the Ct value, WGS (Supplementary Tables 7, 9 and 10). We subsequently received 25 additional specimens from UKHSA, Public Health Wales and Public Health Scotland/Edinburgh Royal Infirmary, including 16 additional blood samples, four respiratory specimens and five stool samples, for HAdV WGS and, depending on residual material, for AAV2 PCR testing followed by sequencing (Tables 1 and 2, Fig. 1b and Supplementary Tables 7, 9 and 10). We also received ten FFPE liver biopsy samples and six serum samples from 11 cases from King’s College Hospital (Table 1). Of these cases, seven had received liver transplants.
Controls from DIAMONDS and PERFORM
PERFORM recruited children from ten EU countries (2016–2020). PERFORM was funded by the European Union’s Horizon 2020 programme under GA no. 668303.
DIAMONDS is funded by the European Union Horizon 2020 programme grant number 848196. Recruitment commenced in 2020 and is ongoing. Both studies recruited children presenting with suspected infection or inflammation and assigned them to diagnostic groups according to a standardized algorithm.
Controls from GOSH for PCR
Blood samples from 17 patients not linked to the non-A–E hepatitis outbreak were tested by real-time PCR targeting AAV2 (Extended Data Table 2b). These comparators were patients with ALT/AST of more than 500 and HAdV or cytomegalovirus viraemia. These were purified DNA from residual diagnostic specimens received in the GOSH microbiology and virology laboratory in the previous year. All residual specimens were stored at −80 °C before testing and pseudo-anonymized at the point of processing and analysis. Viraemia was initially detected using targeted real-time PCR during routine diagnostic testing with UKAS-accredited laboratory-developed assays that conform to ISO:15189 standards.
In addition to the blood samples, four residual liver biopsies from four control patients referred for investigation of infection were tested by AAV2 and HHV-6B PCR. The liver biopsies were submitted to the GOSH microbiology laboratory for routine diagnosis by bacterial broad-range 16S rRNA gene PCR or metagenomics testing in 2021 and 2022. Three of four control patients were known to have elevated levels of liver enzymes. Two adult frozen liver samples previously tested by metagenomics were negative for AAV2 and positive for HHV-6B (Supplementary Table 5).
Controls from UKHSA
We received a blood sample from one patient with elevated levels of liver enzymes and HAdV infection. We also received one control stool sample from Public Health Scotland/Edinburgh Royal Infirmary and 22 control stool samples for sequencing.
Controls from King’s College Hospital
A single FFPE liver biopsy control of normal marginal tissue from a hepatoblastoma from a child was negative for AAV2 and HAdV, but positive for HHV-6B (Ct = 37).
Controls from Queen Mary University of London
We received FFPE liver control samples from ten adults and three children (under 18 years of age) with other viral hepatitis, toxic liver necrosis, autoimmune and other hepatitis, and normal liver, from Queen Mary University of London. PCR gave valid results for samples from two children and eight adults, all of which were negative by PCR for AAV2 and HHV-6, apart from one adult sample, which was positive for HHV-6 at a high Ct value (Supplementary Table 5).
Nucleic acid purification
Frozen liver biopsies were infused overnight at −20 °C with RNAlater-ICE. Up to 20 mg biopsy was lysed with 1.4-mm ceramic, 0.1-mm silica and 4-mm glass beads, before DNA and RNA purification using the Qiagen AllPrep DNA/RNA Mini kit as per the manufacturer’s instructions, with a 30 µl elution volume for RNA and 50 µl for DNA.
Up to 400 µl whole blood was lysed with 0.5-mm and 0.1-mm glass beads before DNA and RNA purification on a Qiagen EZ1 instrument with an EZ1 virus mini kit as per the manufacturer’s instructions, with a 60 µl elution volume.
For quality assurance, every batch of samples was accompanied by a control sample containing feline calicivirus RNA and cowpox DNA, which was processed alongside clinical specimens, from nucleic acid purification through to sequencing. All specimens and controls were spiked with MS2 phage RNA internal control before nucleic acid purification.
Library preparation and sequencing
RNA from whole-blood samples with an RNA yield of more than 2.5 ng µl−1 and from biopsies underwent ribosomal RNA depletion and library preparation with KAPA RNA HyperPrep kit with RiboErase, according to the manufacturer’s instructions. RNA from whole blood with an RNA yield of less than 2.5 ng µl−1 did not undergo rRNA depletion before library preparation.
DNA from whole-blood samples with a DNA yield of more than 1 ng µl−1 and from biopsies underwent depletion of CpG-methylated DNA using the NEBNext Microbiome DNA Enrichment Kit, followed by library preparation with the NEBNext Ultra II FS DNA Library Prep Kit for Illumina, according to manufacturer’s instructions. DNA from whole blood with a DNA yield of less than 1 ng µl−1 did not undergo depletion of CpG-methylated DNA before library preparation.
Sequencing was performed with a NextSeq High output 150 cycle kit with a maximum of 12 libraries pooled per run, including controls.
Metagenomics data analysis
An initial quality control step was performed by trimming adapters and low-quality ends from the reads (Trim Galore!42 0.3.7). Human sequences were then removed using the human reference GRCh38 p.9 (Bowtie2 (ref. 43), version 2.4.1) followed by removal of low-quality and low-complexity sequences (PrinSeq44, version 0.20.3). An additional step of human sequence removal followed (megaBLAST45, version 2.9.0). For RNA-seq, rRNA sequences were also removed using a similar two-step approach (Bowtie2 and megaBLAST). Finally, nucleotide similarity and protein similarity searches were performed (megaBLAST and DIAMOND46 (version 0.9.30), respectively) against custom reference databases that consisted of nucleotide and protein sequences of the RefSeq collections (downloaded March 2020) for viruses, bacteria, fungi, parasites and human.
DNA and RNA sequence data were analysed with metaMix5 (version 0.4) nucleotide and protein analysis pipelines.
metaMix resolves metagenomics mixtures using Bayesian mixture models and a parallel Markov chain Monte Carlo search of the potential species space to infer the most likely species profile.
metaMix considers all reads simultaneously to infer relative abundances and probabilistically assign the reads to the species most likely to be present. It uses an ‘unknown’ category to capture the fact that some reads cannot be assigned to any species. The resulting metagenomic profile includes posterior probabilities of species presence as well as Bayes factor for presence versus absence of specific species. There are two modes: metaMix-protein, which is optimal for RNA virus detection, and metaMix-nucl, which is best for speciation of DNA microorganisms. Both modes were used for RNA-seq, whereas metaMix-nucl was used for DNA-seq.
For sequence results to be valid, MS2 phage RNA had to be detected in every sample and feline calicivirus RNA and cowpox DNA, with no additional unexpected organisms, detected in the controls.
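The validity rule above can be expressed as a simple check per sequencing batch. The following is a minimal sketch only; the function name, the organism labels and the input layout (sets of detected organism names) are hypothetical and are not taken from the diagnostic pipeline.

```python
# Hypothetical organism labels; real pipeline output would use its own naming.
INTERNAL_CONTROL = "MS2 phage"
EXPECTED_IN_CONTROL = {"feline calicivirus", "cowpox virus"}


def run_is_valid(sample_hits, control_hits):
    """Return True if a batch meets the stated validity criteria.

    sample_hits: dict mapping sample ID -> set of organisms detected.
    control_hits: set of organisms detected in the batch control.
    """
    # The MS2 internal control must be detected in every clinical sample.
    samples_ok = all(INTERNAL_CONTROL in hits for hits in sample_hits.values())
    # The batch control must contain both spiked organisms and nothing
    # unexpected (MS2 is tolerated because controls were also spiked with it).
    control_ok = (set(control_hits) - {INTERNAL_CONTROL}) == EXPECTED_IN_CONTROL
    return samples_ok and control_ok
```

A batch in which any sample lacks MS2, or in which the control contains an additional organism, would be rejected by this check.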
Confirmatory mapping of AAV2
The RNA-seq reads were mapped to the AAV2 reference genome (NCBI reference sequence NC_001401) using Bowtie2, with the --very-sensitive option. Samtools47 (version 1.9) and Picard (version 2.26.9; http://broadinstitute.github.io/picard/) were used to sort, deduplicate and index the alignments, and to create a depth file, which was plotted using a custom script in R.
De novo assembly of unclassified reads
We performed a de novo assembly step with metaSPADES48 (v3.15.5), using all the reads with no matches to the nucleotide database that we used for our similarity search. A search using megaBLAST with the standard nucleotide collection was carried out on all resulting contigs over 1,000 bp in length. All of the contigs longer than 1,000 bp matched to human, except two that mapped to Torque Teno virus.
DNA from up to 20 mg of liver was purified using the Qiagen DNeasy Blood & Tissue kit as per the manufacturer’s instructions. Samples with a limited amount of DNA were fragmented to an average size of 10 kb using a Megaruptor 3 (Diagenode) to reach an optimal molar concentration for library preparation. Quality control was performed using a Femto Pulse System (Agilent Technologies) and a Qubit fluorometer (Invitrogen). Samples were prepared for Nanopore sequencing using the ligation sequencing kit SQK-LSK110. DNA was sequenced on a PromethION using R9.4.1 flowcells (Oxford Nanopore Technologies). Samples were run for 72 h including a washing and reload step after 24 h and 48 h.
All library preparation and sequencing were performed by the UCL Long Read Sequencing facility.
Passed reads from MinKNOW were mapped to the reference AAV2 genome (NC_001401) using minimap2 (ref. 49) with the default parameters. Reads were trimmed of adapters using Porechop v0.2.4 (https://github.com/rrwick/Porechop/), with the sequences of the adapters used added to adapters.py, and using an adapter threshold of 85. Reads that also mapped by minimap2 to the human genome (Ensembl GRCh38_v107), which could be ligation artefacts, were excluded from further analysis. The passed reads were also classified using Kraken2 (ref. 50) with the PlusPF database (17 May 2021). The data relating to AAV2 reads in Supplementary Table 3 refer to reads that were classified as AAV2 by both minimap2 and Kraken2 (version 2.0.8-beta), as the results from both methods were similar. Four reads across all four lower-depth samples were classified as HHV-6B by the EPI2ME WIMP51 pipeline. No reads were classified as HAdV or HHV-6B by Kraken2 in the two higher-depth samples. Alignment dot plots were created for the AAV2 reads using redotable (version 1.1)52, with a window size of 20. These were manually classified into possible complex and monomeric structures.
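The read selection described above (reads called AAV2 by both tools, minus reads that also map to the human genome) amounts to a set operation on read identifiers. A minimal sketch follows; the function name and the assumption that each tool's output has already been reduced to a collection of read IDs are illustrative only.

```python
def reads_called_by_both(minimap2_ids, kraken2_ids, human_mapped_ids=()):
    """Return read IDs classified as AAV2 by both methods.

    Reads that also mapped to the human genome are dropped, because they
    could be ligation artefacts.
    """
    agreed = set(minimap2_ids) & set(kraken2_ids)
    return agreed - set(human_mapped_ids)
```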
Integration analysis of Illumina data
We investigated potential integrations of AAV2 and HHV-6 viruses into the genome using the Illumina metagenomics data for five liver transplant cases. We first processed the paired-end reads (average sequence coverage per genome = 5×), quality checking using FastQC53, with barcode and adaptor sequence trimmed by TrimGalore (phred-score = 20). Potential viral integrations were investigated with Vseq-Toolkit54 (mode 3 with default settings except for high stringency levels). Predicted genomic integrations were visualized with IGV55, requiring at least three reads supporting an integration site, spanning both human and viral sequences. Predicted integrations were supported by only one read, thus not fulfilling the algorithm criteria. Sequencing was performed at a lower depth than optimal for integration analysis, but no evidence was found for AAV2 or HHV-6B integration into the genomes of cases.
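The read-support criterion can be illustrated as a filter over candidate junctions. This is a sketch under stated assumptions, not the Vseq-Toolkit implementation: the input is assumed to be one (chromosome, position) tuple per chimeric human–virus read, and the function name is hypothetical.

```python
from collections import Counter


def supported_integration_sites(junction_reads, min_reads=3):
    """Keep candidate integration sites backed by at least `min_reads` reads.

    junction_reads: iterable of (chromosome, position) tuples, one per read
    spanning both human and viral sequence.
    """
    counts = Counter(junction_reads)
    return {site: n for site, n in counts.items() if n >= min_reads}
```

With every candidate site supported by a single read, as reported here, the filter returns no sites.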
Real-time PCR targeting a 62-nt region of the AAV2 inverted terminal repeat sequence was performed using primers and probes previously described56. This assay has been predicted to amplify AAV2 and AAV6. The Qiagen QuantiNova probe PCR kit (PERFORM and DIAMONDS controls) or the Qiagen Quantifast probe PCR kit (all other samples) were used. Each 25-µl reaction consisted of 0.1 µM forward primer, 0.34 µM reverse primer and 0.1 µM probe with 5 µl template DNA.
Real-time PCR targeting a 74-bp region of the HHV-6 DNA polymerase gene was performed using primers and probes previously described57 multiplexed with an internal positive control targeting mouse (mus) DNA spiked into each sample during DNA purification, as previously described58. In brief, each 25-µl reaction consisted of 0.5 µM of each primer, 0.3 µM HHV-6 probe, 0.12 µM of each mus primer, 0.08 µM mus probe and 12.5 µl Qiagen Quantifast Fast mastermix with 10 µl template DNA.
Real-time PCR targeting a 132-bp region of the HAdV hexon gene was performed using primers and probes previously described59 multiplexed with an internal positive control targeting mouse (mus) DNA spiked into each sample during DNA purification, as previously described58. In brief, each 25-µl reaction consisted of 0.6 µM of each HAdV primer, 0.4 µM HAdV probe, 0.12 µM of each mus primer, 0.08 µM mus probe and 12.5 µl Qiagen Quantifast Fast mastermix with 10 µl template DNA.
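The final concentrations listed for these 25-µl reactions translate into pipetting volumes through the usual dilution relation C1V1 = C2V2. The sketch below illustrates the arithmetic; the 10 µM working-stock concentration in the usage note is an assumption, as stock concentrations are not given in the text.

```python
def stock_volume_ul(final_conc_um, reaction_volume_ul, stock_conc_um):
    """Volume of stock (µl) needed to reach a final concentration.

    Applies C1 * V1 = C2 * V2, with all concentrations in µM.
    """
    if stock_conc_um <= 0:
        raise ValueError("stock concentration must be positive")
    return final_conc_um * reaction_volume_ul / stock_conc_um
```

For example, a primer at 0.6 µM final concentration in a 25-µl reaction would need 1.5 µl of a hypothetical 10 µM stock.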
PCR cycling for all targets, apart from the controls from the PERFORM and DIAMONDS studies, was performed on an ABI 7500 Fast thermocycler and consisted of 95 °C for 5 min followed by 45 cycles of 95 °C for 30 s and 60 °C for 30 s. For the PERFORM and DIAMONDS controls, PCR was performed on a StepOnePlus Real-Time PCR System and consisted of 95 °C for 2 min followed by 45 cycles of 95 °C for 5 s and 60 °C for 10 s. Each PCR run included a no template control and a DNA-positive control for each target.
Neat DNA extracts of the FFPE material were inhibitory to PCR, so PCR results shown were performed following a 1 in 10 dilution.
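When interpreting Ct values from diluted extracts, it may help to recall that dilution itself delays the Ct. The sketch below gives the expected shift under an assumed amplification efficiency; the efficiency value is an assumption (1.0 means perfect doubling per cycle) and was not measured here.

```python
import math


def expected_ct_shift(dilution_factor, efficiency=1.0):
    """Expected increase in Ct caused by diluting the template.

    With efficiency E, each cycle multiplies the product by (1 + E), so a
    D-fold dilution costs log(D) / log(1 + E) cycles.
    """
    return math.log(dilution_factor) / math.log(1.0 + efficiency)
```

Under perfect efficiency, a 1 in 10 dilution corresponds to a shift of about 3.3 cycles.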
AAV2 quantitative PCR with reverse transcription
RNA samples were treated with the Turbo-DNA free kit (Thermo) to remove residual genomic DNA. Complementary DNA (cDNA) was synthesized using the QuantiTect Reverse Transcription kit. In brief, 12 µl of RNA was mixed with 2 µl of genomic DNA Wipeout buffer and incubated at 42 °C for 2 min and transferred to ice. For reverse transcription, 6 µl mastermix was used and incubated at 42 °C for 20 min followed by 3 min at 95 °C.
Real-time PCR targeting a 120-nt region of the AAV2 cap open reading frame sequence was performed using primers AAV2_cap_Fw-ATCCTTCGACCACCTTCAGT, AAV2_cap_Rv-GATTCCAGCGTTTGCTGTT and the probe AAV2_cap_Pr FAM-ACACAGTAT/ZEN/TCCACGGGACAGGT-IBFQ. This assay has been predicted to amplify AAV2 and AAV6. The Qiagen QuantiNova probe PCR kit was used. Each 25-µl reaction consisted of 0.1 µM forward primer, 0.1 µM reverse primer and 0.2 µM probe with 2.5 µl template cDNA.
PCR was performed on a StepOnePlus Real-Time PCR System and consisted of incubation at 95 °C for 2 min followed by 45 cycles of 95 °C for 5 s and 60 °C for 10 s. Each PCR run included a no template control, a DNA-positive control and an RNA control from each sample to verify efficient removal of genomic DNA.
All immunohistochemistry was done on FFPE tissue cut at a thickness of 3 µm.
AdV immunohistochemistry was carried out using the Ventana Benchmark ULTRA, Optiview Detection Kit, PIER with protease 1 for 4 min and antibody incubation for 32 min (AdV clone 2/6 and 20/11, Roche, 760-4870, pre-diluted). The positive control was a known HAdV-positive gastrointestinal surgical case.
Preparation of AAV2-positive controls
The plasmid used for transfection was pAAV2/2 (addgene, plasmid #104963; https://www.addgene.org/104963/), which expresses the genes encoding Rep/Cap of AAV2. This was delivered by tail-vein hydrodynamic injection60 into albino C57BL/6 mice (5 mg in 2 ml PBS). Negative controls received PBS alone. At 48 h, mice were terminally exsanguinated and perfused by PBS. Livers were collected into 10% neutral buffered formalin (CellPath UK). This was performed under Home Office License PAD4E6357.
AAV2 immunohistochemistry was carried out with four commercially available antibodies:
Leica Bond-III, Bond Polymer Refine Detection Kit with DAB Enhancer, HIER with Bond Epitope Retrieval Solution 1 (citrate based pH 6) for 30 min and antibody incubation for 30 min (anti-AAV VP1/VP2/VP3 clone B1, PROGEN, 690058S, 1:100).
Leica Bond-III, Bond Polymer Refine Detection Kit with DAB Enhancer, HIER with Bond Epitope Retrieval Solution 1 (citrate based pH 6) for 40 min and antibody incubation for 30 min (anti-AAV VP1/VP2/VP3 rabbit polyclonal, OriGene, BP5024, 1:100).
Leica Bond-III, Bond Polymer Refine Detection Kit with DAB Enhancer, HIER with Bond Epitope Retrieval Solution 1 (citrate based pH 6) for 40 min and antibody incubation for 30 min (anti-AAV VP1 clone A1, OriGene, BM5013, 1:100).
Leica Bond-III, Bond Polymer Refine Detection Kit with DAB Enhancer, HIER with Bond Epitope Retrieval Solution 1 (citrate based pH 6) for 40 min and antibody incubation for 30 min (anti-AAV VP1/VP2 clone A69, OriGene, BM5014, 1:100).
HHV-6 immunohistochemistry staining was carried out with:
Leica Bond-III, Bond Polymer Refine Detection Kit with DAB Enhancer, PIER with Bond Enzyme 1 Kit for 10 min and antibody incubation for 30 min (mouse monoclonal antibody (C3108-103) to HHV-6, ABCAM, ab128404, 1:100).
Negative reagent control slides were stained using the same antigen retrieval conditions and staining protocol incubation times using only Bond™ Primary Antibody Diluent #AR9352 for the antibody incubation.
Samples of liver were fixed in 2.5% glutaraldehyde in 0.1 M cacodylate buffer followed by secondary fixation in 1.0% osmium tetroxide. Tissues were dehydrated in graded ethanol, transferred to an intermediate reagent, propylene oxide, and then infiltrated and embedded in Agar 100 epoxy resin. Polymerization was undertaken at 60 °C for 48 h. Ultrathin sections of 90 nm were cut using a Diatome diamond knife on a Leica UC7 ultramicrotome. Sections were transferred to copper grids and stained with alcoholic uranyl acetate and Reynold’s lead citrate. The samples were examined using a JEOL 1400 transmission electron microscope. Images were captured on an AMT XR80 digital camera.
To produce the capture probes for hybridization, biotinylated RNA oligonucleotides (baits) used in the SureSelectXT protocols for HAdV and HHV-6 WGS were designed in-house using Agilent community design baits with part numbers 5191-6711 and 5191-6713, respectively. They were synthesized by Agilent Technologies (2021) (available through Agilent’s Community Designs programme: SSXT CD Pan Adenovirus and SSXT CD Pan HHV-6 and used previously61,62).
Library preparation and sequencing
For WGS of HAdV and HHV-6B, DNA (bulked with male human genomic DNA (Promega) if required) was sheared using a Covaris E220 focused ultrasonication system (PIP 75, duty factor of 10, 1,000 cycles per burst). End-repair, non-templated addition of 3′ poly A, adapter ligation, hybridization, PCR (pre-capture cycles dependent on DNA input and post-capture cycles dependent on viral load) and all post-reaction clean-up steps were performed according to either the SureSelectXT Low Input Target Enrichment for Illumina Paired-End Multiplexed Sequencing protocol (version A0), the SureSelectXT Target Enrichment for Illumina Paired-End Multiplexed Sequencing protocol (version C3) or the SureSelectXTHS Target Enrichment using the Magnis NGS Prep System protocol (version A0) (Agilent Technologies). Quality control steps were performed on the 4200 TapeStation (Agilent Technologies). Samples were sequenced using the Illumina MiSeq platform. Base calling and sample demultiplexing were performed as standard for the MiSeq platform, generating paired FASTQ files for each sample. A negative control was included on each processing run. A targeted enrichment approach was used due to the predicted high variability of the HHV-6 and HAdV genomes.
For AAV2 WGS, an AAV2 primer scheme was designed using primalscheme63 with 17 AAV2 sequences from NCBI and one AAV2 sequence provided by GOSH from metagenomic sequencing of a liver biopsy DNA extract as the reference material. These primers amplify 15 overlapping 400-bp amplicons. Primers were supplied by Merck. Two multiplex PCRs were prepared using Q5 Hot Start High-Fidelity 2X Master Mix, with a 65 °C, 3 min annealing/extension temperature. Pools 1 and 2 multiplex PCRs were run for 35 cycles. Of each PCR, 10 µl was combined and 20 µl nuclease-free water was added. Libraries were prepared either manually or on the Agilent Bravo NGS workstation option B, following a reduced-scale version of the Illumina DNA protocol as used in the CoronaHiT protocol64. Equal volumes of the final libraries were pooled, bead purified and sequenced on the Illumina MiSeq. A negative control was included on each processing run.
All library preparation and sequencing were performed by UCL Genomics.
AAV2 sequence analysis
The raw fastq reads were adapter trimmed and low-quality reads were removed. The reads were mapped to the NC_001401 reference sequence and then the amplicon primer regions were trimmed using the locations provided in a bed file. Consensus sequences were then called at a minimum of 10× coverage. The entire processing of raw reads to consensus was carried out using the nf-core/viralrecon pipeline (https://nf-co.re/viralrecon/2.4.1; https://doi.org/10.5281/zenodo.3901628). Basic quality metrics for the samples sequenced are in Supplementary Table 9. All samples that gave 10× genome coverage over 90% were then used for further phylogenetic analysis. Samples were aligned along with known reference strains from GenBank using MAFFT65 (version v7.271), and the trees were built with IQ-TREE66 (multicore version 1.6.12) with 1,000 rapid bootstraps and approximate likelihood-ratio test support. The samples were then labelled based on type and provider on the trees (Fig. 3a).
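The inclusion rule for phylogenetic analysis (more than 90% of the genome covered at 10× or deeper) can be sketched as follows. This is an illustration rather than part of the nf-core/viralrecon pipeline; the function names and the assumption that per-position depths are available as a list of integers are hypothetical.

```python
def fraction_covered(depths, min_depth=10):
    """Fraction of reference positions with read depth >= min_depth."""
    if not depths:
        raise ValueError("depths must not be empty")
    return sum(d >= min_depth for d in depths) / len(depths)


def passes_for_phylogeny(depths, min_depth=10, min_fraction=0.9):
    """True if more than `min_fraction` of positions reach `min_depth`."""
    return fraction_covered(depths, min_depth) > min_fraction
```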
For each AAV2 sample, we aligned the consensus nucleotide sequence to the AAV2 reference sequence. From these alignments, the exact coordinates of the sample capsid were determined. We then used the coordinates to extract the corresponding nucleotide sequence and translated it to find the amino acid sequence. Next, we compared each sample to the reference to identify amino acid changes. Amino acid sequences from AAV capsid sequences were retrieved from GenBank for AAV1 to AAV12. Amino acid sequences of capsid constructs designed to be more hepatotropic were retrieved from refs. 16,67. These sequence sets were then aligned to the AAV2 reference sequence using MAFFT65. We then compared each construct to the AAV2 reference to identify amino acid changes present, while retaining the AAV2 coordinate set.
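The capsid comparison can be illustrated with a short sketch that translates two coding sequences and lists the amino acid differences in reference coordinates. It assumes the sample and reference capsid sequences are already extracted, in frame, of equal length and free of insertions or deletions; these simplifications and the function names are ours, not a description of the scripts actually used.

```python
from itertools import product

# Standard genetic code, built in TCAG order.
_BASES = "TCAG"
_AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
          "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): a for c, a in zip(product(_BASES, repeat=3), _AMINO)}


def translate(nt_seq):
    """Translate an in-frame nucleotide sequence; unknown codons become 'X'."""
    nt_seq = nt_seq.upper()
    codons = (nt_seq[i:i + 3] for i in range(0, len(nt_seq) - len(nt_seq) % 3, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def amino_acid_changes(ref_nt, sample_nt):
    """List substitutions such as 'A2V' using reference (1-based) numbering."""
    ref_aa, sample_aa = translate(ref_nt), translate(sample_nt)
    if len(ref_aa) != len(sample_aa):
        raise ValueError("sequences must be the same length (no indels)")
    return [f"{r}{i}{s}" for i, (r, s) in enumerate(zip(ref_aa, sample_aa), 1)
            if r != s]
```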
HAdV and HHV-6B sequence analysis
Raw data quality control was performed using trim-galore (v.0.6.7) on the raw FASTQ files.
For HHV-6B, short reads were mapped with BWA mem68 (0.7.17-r1188) using the RefSeq reference NC_000898.
For HAdV, genotyping is performed using AYUKA11 (version 22-111). This novel tool is used to confidently assign one or more HAdV genotypes to a sample of interest, assessing inter-genotype recombination if more than one genotype is detected. The results from this screening step guide which downstream analyses are performed and which reference genome (or genomes) is used. If mixed infection is suspected, reads are separated using bbsplit (https://sourceforge.net/projects/bbmap/), and each genotype is analysed independently as normal. If recombination is suspected, a more detailed analysis is performed using Recombination Detection Program (RDP) and the sample is excluded from phylogenetic analysis. After genotyping, the cleaned read data are mapped using BWA to the relevant reference sequence (or sequences), and SNPs and small insertions and deletions are called using bcftools (version 1.15.1, https://github.com/samtools/bcftools) and a consensus sequence is generated also with bcftools, masking with Ns positions that do not have enough read support (15× by default). Consensus sequences generated with the pipeline are then concatenated to previously sequenced samples and a multiple sequence alignment is performed using the G-INS-I algorithm in the MAFFT software (MAFFT G-INS-I v7.481). The multiple sequence alignment is then used for phylogenetic analysis with IQ-TREE (IQ-TREE 2 2.2.0), using modelfinder and performing 1,000 rapid bootstraps.
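The masking step can be pictured as replacing every consensus base that lacks sufficient read support with N. The sketch below is illustrative only and does not reproduce how bcftools performs the masking; the function name and the per-position depth list are assumptions.

```python
def mask_low_depth(consensus, depths, min_depth=15):
    """Replace bases at positions with depth below `min_depth` by 'N'."""
    if len(consensus) != len(depths):
        raise ValueError("consensus and depths must have the same length")
    return "".join(base if depth >= min_depth else "N"
                   for base, depth in zip(consensus, depths))
```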
Proteomics data generation
Liver explant tissue from cases was homogenized in lysis buffer, 100 mM Tris (pH 8.5), 5% sodium dodecyl sulfate, 5 mM tris(2-carboxyethyl)phosphine and 20 mM chloroacetamide then heated at 95 °C for 10 min and sonicated in an ultrasonic bath for another 10 min. The lysed proteins were quantified with NanoDrop 2000 (Thermo Fisher Scientific). One-hundred micrograms was precipitated with the methanol/chloroform protocol and then protein pellets were reconstituted in 100 mM Tris (pH 8.5) and 4% sodium deoxycholate (SDC). The proteins were subjected to proteolysis with 1:50 trypsin overnight at 37 °C with constant shaking. Digestion was stopped by adding 1% trifluoroacetic acid to a final concentration of 0.5%. Precipitated SDC was removed by centrifugation at 10,000g for 5 min, and the supernatant containing digested peptides was desalted on an SOLAµ HRP (Thermo Fisher Scientific). Of the desalted peptide, 50 µg was then fractionated on Vanquish HPLC (Thermo Fisher Scientific) using an Acquity BEH C18 column (2.1 × 50 mm with 1.7-µm particles from Waters): buffer A was 10 mM ammonium formate at pH 10, whereas buffer B was 80% acetonitrile and the flow was set to 500 µl per minute. We used a gradient of 8 min to collect 24 fractions that were then concatenated to obtain 12 fractions. These 12 fractions were dried and dissolved in 2% formic acid before liquid chromatography–tandem mass spectrometry analysis. An estimated total of 2,000 ng from each fraction was analysed using an Ultimate3000 high-performance liquid chromatography system coupled online to an Eclipse mass spectrometer (Thermo Fisher Scientific). Buffer A consisted of water acidified with 0.1% formic acid, whereas buffer B was 80% acetonitrile and 20% water with 0.1% formic acid.
The peptides were first trapped for 1 min at 30 μl per minute with 100% buffer A on a trap (0.3 mm × 5 mm with PepMap C18, 5 μm, 100 Å; Thermo Fisher Scientific); after trapping, the peptides were separated by a 50-cm analytical column (Acclaim PepMap, 3 μm; Thermo Fisher Scientific). The gradient was 9–35% buffer B for 103 min at 300 nl per minute. Buffer B was then raised to 55% in 2 min and increased to 99% for the cleaning step. Peptides were ionized using a spray voltage of 2.1 kV and a capillary heated at 280 °C. The mass spectrometer was set to acquire full-scan mass spectrometry spectra (350:1,400 mass:charge ratio) for a maximum injection time set to auto at a mass resolution of 120,000 and an automated gain control target value of 100%. For a second, the most intense precursor ions were selected for tandem mass spectrometry. Higher energy collisional dissociation (HCD) fragmentation was performed in the HCD cell, with the readout in the Orbitrap mass analyser at a resolution of 15,000 (isolation window of 3 Th) and an automated gain control target value of 200% with a maximum injection time set to auto and a normalized collision energy of 30%. All raw files were analysed by MaxQuant69 v2.1 software using the integrated Andromeda search engine and searched against the Human UniProt Reference Proteome (February release with 79,057 protein sequences) together with UniProt-reported AAV proteins and specific fasta created using EMBOSS Sixpack translating patient’s virus genome. MaxQuant was used with the standard parameters with only the addition of deamidation (N) as variable modification. Data analysis was then carried out with Perseus70 v2.05: proteins reported in the file ‘proteinGroups.txt’ were filtered for reverse and potential contaminants. Figures were created using Origin pro version 2022b.
Transduction of AAV2 capsid mutants
A transgene sequence containing enhanced green fluorescent protein (eGFP) was packaged into rAAV2 particles to track their expression in transduced cells, compared with rAAV capsids derived from canonical AAV2, AAV9 and a synthetic liver-tropic AAV vector called LK03 (ref. 15).
rAAV vector particles were delivered to Huh-7 hepatocytes at a multiplicity of infection of 100,000 vector genomes per cell before analysing eGFP expression by flow cytometry 72 h later.
Recombinant AAV capsid sequence
The VP1 sequence was generated by generating a consensus sequence from a multiple sequence alignment of sequenced AAV2 genomes derived from patient samples, using the Biopython71 package AlignIO. The designed VP1 sequence was then synthesized as a ‘gBlock’ (Integrated DNA Technologies) and incorporated into an AAV2 RepCap plasmid (AAV2/2 was a gift from M. Fan, Addgene plasmid #104963) between the SwaI and XmaI restriction sites, using InFusion cloning reagent (product 638948, Clontech).
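Deriving a consensus from a multiple sequence alignment can be sketched as a per-column majority vote. The example below uses plain Python rather than Biopython, so it should be read as an illustration of the idea and not as the script used; the gap handling (dropping columns where a gap is most common) and the tie-breaking (first residue encountered) are assumptions.

```python
from collections import Counter


def majority_consensus(aligned_seqs, gap="-"):
    """Majority-rule consensus of equal-length aligned sequences."""
    lengths = {len(s) for s in aligned_seqs}
    if len(lengths) != 1:
        raise ValueError("aligned sequences must share one length")
    consensus = []
    for column in zip(*aligned_seqs):
        residue, _count = Counter(column).most_common(1)[0]
        if residue != gap:  # skip columns where a gap is the majority
            consensus.append(residue)
    return "".join(consensus)
```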
AAV vector production
rAAV particles were generated by transient transfection of HEK 293T cells as previously described72. In brief, 1.8 × 10⁷ cells were plated in 15-cm dishes before transfecting the pAAV-CAG-eGFP transgene plasmid (a gift from E. Boyden, Addgene plasmid #37825), the relevant RepCap plasmid and the pAdDeltaF6 helper plasmid (a gift from J. M. Wilson, Addgene plasmid #112867), at a ratio of 10.5 µg, 10.5 µg and 30.5 µg, respectively, using PEIPro transfection reagent (PolyPlus) at a ratio of 1 µl per 1 µg DNA. Seventy-two hours post-transfection, cell pellets and supernatant were harvested and rAAV particles were purified using an Akta HPLC platform. rAAV particle genome copy numbers were calculated by quantitative PCR targeting the vector transgene region. The rAAV2 vector used in this study was purchased as ready-to-use AAV2 particles from Addgene (Addgene viral prep #37825-AAV2).
Analysis of rAAV transduction
Huh-7 hepatocytes (a gift from J. Baruteau, UCL) were plated in DMEM medium supplemented with 10% FBS and 1% penicillin–streptomycin supplement. The cell line was validated by testing for glypican-3 and was not tested for mycoplasma contamination. Cells were plated at a density of 1.5 × 10³ cells per square centimetre and transduced with 1 × 10⁵ viral genomes per cell. Transductions were performed in the presence or absence of 400 µg ml−1 heparin, which was supplemented directly to cell media. Seventy-two hours after transduction, cells were analysed by microscopy using an EVOS Cell Imaging System (Thermo Fisher Scientific) before quantifying eGFP expression by flow cytometry using a Cytoflex Flow Cytometer (Beckman). eGFP-positive cells were determined by gating the live-cell population and quantifying the level of eGFP signal versus untransduced controls.
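The vector dose per well follows directly from the plating density and the multiplicity of infection. The sketch below reads the plating density as 1.5 × 10³ cells per square centimetre and the dose as 1 × 10⁵ vector genomes per cell; the well area and vector titre in the usage note are hypothetical values chosen for illustration, as neither is stated in the text.

```python
def vector_genomes_needed(cells_per_cm2, well_area_cm2, genomes_per_cell):
    """Total vector genomes required to dose one well."""
    cells = cells_per_cm2 * well_area_cm2
    return cells * genomes_per_cell


def vector_volume_ul(vector_genomes, titre_vg_per_ml):
    """Volume of vector preparation (µl) containing the required genomes."""
    return vector_genomes / titre_vg_per_ml * 1000.0
```

For a hypothetical 2 cm² well, 3 × 10⁸ vector genomes would be required, which is 0.3 µl of a preparation at an assumed titre of 1 × 10¹² vector genomes per ml.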
Human short-read data analysis
Cytokine transcriptomics analysis
Cytokine-inducible gene expression modules were derived from previously published bulk tissue genome-wide transcriptomes of the tuberculin skin test that have been shown to reflect canonical human in vivo cell-mediated immune pathways73 using a validated bioinformatic approach74. Cytokine regulators of genes enriched in the tuberculin skin test73 (ArrayExpress accession number E-MTAB-6816) were identified using Ingenuity Pathway Analysis (Qiagen). Average correlation of log2-transformed transcripts per million data for every gene pair in each of the target gene modules was compared with 100 iterations of randomly selected gene modules of the same size, to select cytokine-inducible modules that showed significantly greater co-correlation (adjusted P < 0.05), representing co-regulated transcriptional networks for each of 59 cytokines. We then used the average log2-transformed transcripts per million expression of all the genes in each of these co-regulated modules to quantify the biological activity of the associated upstream cytokine within bulk genome-wide transcriptional profiles from AAV2-associated hepatitis (n = 4) obtained in the present study, compared with published log2-transformed and normalized microarray data from normal adult liver (n = 10) and hepatitis B adult liver (n = 17) (Gene Expression Omnibus accession number GSE96851)18. To enable comparison across the datasets, we transformed average gene expression values for each cytokine-inducible module to standardized values (Z scores) using the mean and standard deviation of randomly selected gene sets of the same size within each individual dataset. Statistically significant differences in Z scores between groups were identified by Student’s t-tests with multiple testing correction (adjusted P < 0.05).
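The standardization step can be sketched for a single sample as follows: the mean expression of a module is compared with the distribution of mean expression across random gene sets of the same size. The number of random sets used for the Z score, the fixed seed and the function name are assumptions made for this illustration.

```python
import random
import statistics


def module_z_score(expression, module_genes, n_random=100, seed=0):
    """Z score of a module's mean expression against random gene sets.

    expression: dict mapping gene -> log2-transformed expression (one sample).
    module_genes: list of genes in the cytokine-inducible module.
    """
    rng = random.Random(seed)
    all_genes = sorted(expression)
    size = len(module_genes)
    module_mean = statistics.mean(expression[g] for g in module_genes)
    # Mean expression of randomly drawn gene sets of the same size.
    random_means = [
        statistics.mean(expression[g] for g in rng.sample(all_genes, size))
        for _ in range(n_random)
    ]
    return (module_mean - statistics.mean(random_means)) / statistics.stdev(random_means)
```

Because each dataset is standardized against its own random gene sets, the resulting Z scores can be compared between RNA-seq and microarray data despite their different scales.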
Proteomics differential expression
To compare the proteomics data from the explanted livers of cases with data from healthy livers, we downloaded the raw files from two studies19,20 from PRIDE. The raw files were searched together with our files using the same settings and databases.
We performed differential expression analyses at the protein level and peptide level using a hybrid approach including statistical inference on the abundance (quantitative approach), as well as the presence or absence (binary approach) of proteins or peptides. DEP R package version 1.18.0 was used for quantitative analysis75. Proteins or peptides were filtered for those detected in all replicates of at least one group (case or control). The data were background corrected and variance was normalized using variance-stabilizing transformation. Missing intensity values were not distributed randomly and were biased to specific samples (either cases or controls). Therefore, for imputing the missing data, we applied random draws from a manually defined left-shifted Gaussian distribution using the DEP impute function with parameters fun = "man", shift = 1.8 and scale = 0.3. The test_diff function, based on linear models and the empirical Bayes method, was used for testing differential expression between the case and control samples.
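The left-shifted Gaussian imputation can be sketched in a few lines of Python. This is a minimal illustration of the general technique, assuming that missing values in one sample are replaced by draws from a normal distribution whose mean is the observed mean minus shift × s.d. and whose s.d. is scale × the observed s.d. It is not the DEP implementation, and the function name and intensity values are hypothetical.

```python
import random
import statistics


def impute_left_shifted(values, shift=1.8, scale=0.3, seed=0):
    """Fill missing (None) intensities with draws from a left-shifted Gaussian.

    The observed values of the sample define mean and s.d.; imputed values are
    drawn from N(mean - shift * sd, (scale * sd) ** 2).
    """
    rng = random.Random(seed)
    observed = [v for v in values if v is not None]
    mu = statistics.fmean(observed)
    sd = statistics.stdev(observed)
    return [
        v if v is not None else rng.gauss(mu - shift * sd, scale * sd)
        for v in values
    ]


# Hypothetical log2 intensities for one sample; None marks a missing value.
sample = [24.1, 25.3, None, 26.0, 23.8, None, 25.1]
imputed = impute_left_shifted(sample)
```

Shifting the distribution to the left reflects the assumption that values are missing mainly because the protein or peptide is at low abundance, so the imputed intensities fall near the lower end of the observed range.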
HLA typing methods
Typing was undertaken in the liver centre units. Next-generation sequencing (sequencing by synthesis; Illumina) using AllType kits (VHBio/OneLambda), a high-resolution HLA typing method, was used.
Fisher’s exact test and the two-sided Wilcoxon (Mann–Whitney) non-parametric rank sum test were used to test for differences between case and control groups. Where multiple groups were compared, Kruskal–Wallis tests followed by pairwise Wilcoxon tests with Benjamini–Hochberg correction were performed. All analyses were performed in R version 4.2.0.
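For readers unfamiliar with the Benjamini–Hochberg step applied to the pairwise P values, the following Python sketch shows the standard adjustment. It is an illustration only: the analyses in the paper were run in R, and the function name and example P values here are invented.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value to the smallest so that each adjusted
    # value is capped by the adjusted value of the next-larger P value.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Invented raw P values from four hypothetical pairwise comparisons.
raw_p = [0.01, 0.04, 0.03, 0.20]
adj_p = benjamini_hochberg(raw_p)
```

Each raw P value is multiplied by the number of tests divided by its rank, and the running minimum keeps the adjusted values monotonic, so a comparison can never end up with a smaller adjusted P value than one that had a smaller raw P value.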
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.
The consensus genomes from viral WGS data are deposited in GenBank. IDs can be found in Supplementary Table 7 (HAdV), Supplementary Table 9 (AAV2) and Supplementary Table 10 (HHV6). The MS proteomics data have been deposited in the ProteomeXchange Consortium via the PRIDE partner repository with the dataset identifier PXD035925.
The code for metagenomics and PCR analysis can be found at https://github.com/sarah-buddle/unknown-hepatitis. The transcriptomics analysis code is available at https://github.com/innate2adaptive/Bulk-RNAseq-analysis/tree/main/Zscore_gene_expression_module_analysis. The proteomics differential expression analysis code can be found at https://github.com/MahdiMoradiMarjaneh/proteomics_and_transcriptomics_of_hepatitis.
ECDC–WHO. Joint ECDC–WHO Regional Office for Europe hepatitis of unknown origin in children surveillance bulletin. ECDC https://cdn.ecdc.europa.eu/novhep-surveillance/ (2022).
UKHSA. Investigation into acute hepatitis of unknown aetiology in children in England: technical briefing 3 (UKHSA, 2022).
UKHSA. Investigation into acute hepatitis of unknown aetiology in children in England: technical briefing 4 (UKHSA, 2022).
Marsh, K. et al. Investigation into cases of hepatitis of unknown aetiology among young children, Scotland, 1 January 2022 to 12 April 2022. Euro Surveill. 27, 2200318 (2022).
Morfopoulou, S. & Plagnol, V. Bayesian mixture analysis for metagenomic community profiling. Bioinformatics 31, 2930–2938 (2015).
Berns, K. I. Parvovirus replication. Microbiol. Rev. 54, 316–329 (1990).
Sun, X. et al. Molecular analysis of vector genome structures after liver transduction by conventional and self-complementary adeno-associated viral serotype vectors in murine and nonhuman primate models. Hum. Gene Ther. 21, 750–762 (2010).
Meier, A. F. et al. Herpes simplex virus co-infection facilitates rolling circle replication of the adeno-associated virus genome. PLoS Pathog. 17, e1009638 (2021).
Xu, Y. et al. Detection of viral pathogens with multiplex nanopore MinION sequencing: be careful with cross-talk. Front. Microbiol. 9, 2225 (2018).
Eccles, D., White, R., Pellefigues, C., Ronchese, F. & Lamiable, O. Investigation of chimeric reads using the MinION. F1000Res. 6, 631 (2017).
Guerra-Assunção, J. A., Goldstein, R. & Breuer, J. AYUKA: a toolkit for fast viral genotyping using whole genome sequencing. Preprint at bioRxiv https://doi.org/10.1101/2022.09.07.506755 (2022).
Risso-Ballester, J., Cuevas, J. M. & Sanjuán, R. Genome-wide estimation of the spontaneous mutation rate of human adenovirus 5 by high-fidelity deep sequencing. PLoS Pathog. 12, e1006013 (2016).
Liu, L. et al. Genetic diversity and molecular evolution of human adenovirus serotype 41 strains circulating in Beijing, China, during 2010–2019. Infect. Genet. Evol. 95, 105056 (2021).
Cao, M., You, H. & Hermonat, P. L. The X gene of adeno-associated virus 2 (AAV2) is involved in viral DNA replication. PLoS ONE 9, e104596 (2014).
Lisowski, L. et al. Selection and evaluation of clinically relevant AAV variants in a xenograft liver model. Nature 506, 382–386 (2014).
Cabanes-Creus, M. et al. Novel human liver-tropic AAV variants define transferable domains that markedly enhance the human tropism of AAV7 and AAV8. Mol. Ther. Methods Clin. Dev. 24, 88–101 (2022).
Cabanes-Creus, M. et al. Restoring the natural tropism of AAV2 vectors for human liver. Sci. Transl. Med. 12, eaba3312 (2020).
Chen, Z. et al. Role of humoral immunity against hepatitis B virus core antigen in the pathogenesis of acute liver failure. Proc. Natl Acad. Sci. USA 115, E11369–E11378 (2018).
Wang, D. et al. A deep proteome and transcriptome abundance atlas of 29 healthy human tissues. Mol. Syst. Biol. 15, e8503 (2019).
Niu, L. et al. Dynamic human liver proteome atlas reveals functional insights into disease pathways. Mol. Syst. Biol. 18, e10947 (2022).
Timpe, J. M., Verrill, K. C. & Trempe, J. P. Effects of adeno-associated virus on adenovirus replication and gene expression during coinfection. J. Virol. 80, 7807–7815 (2006).
Yakobson, B., Koch, T. & Winocour, E. Replication of adeno-associated virus in synchronized cells without the addition of a helper virus. J. Virol. 61, 972–981 (1987).
la Bella, T. et al. Adeno-associated virus in the liver: natural history and consequences in tumour development. Gut 69, 737–747 (2020).
Chowdary, P. et al. Phase 1–2 trial of AAVS3 gene therapy in patients with hemophilia B. N. Engl. J. Med. 387, 237–247 (2022).
Chand, D. et al. Hepatotoxicity following administration of onasemnogene abeparvovec (AVXS-101) for the treatment of spinal muscular atrophy. J. Hepatol. 74, 560–566 (2021).
Mullard, A. Gene therapy community grapples with toxicity issues, as pipeline matures. Nat. Rev. Drug Discov. 20, 804–805 (2021).
Morales, L., Gambhir, Y., Bennett, J. & Stedman, H. H. Broader implications of progressive liver dysfunction and lethal sepsis in two boys following systemic high-dose AAV. Mol. Ther. 28, 1753–1755 (2020).
Hüser, D. et al. Integration preferences of wildtype AAV-2 for consensus rep-binding sites at numerous loci in the human genome. PLoS Pathog. 6, e1000985 (2010).
Nault, J. C. et al. Recurrent AAV2-related insertional mutagenesis in human hepatocellular carcinomas. Nat. Genet. 47, 1187–1193 (2015).
Dalwadi, D. A. et al. AAV integration in human hepatocytes. Mol. Ther. 29, 2898–2909 (2021).
Butterfield, J. S. S. et al. IL-15 blockade and rapamycin rescue multifactorial loss of factor VIII from AAV-transduced hepatocytes in hemophilia A mice. Mol. Ther. 30, 3552–3569 (2022).
Nathwani, A. C. Gene therapy for hemophilia. Hematology Am. Soc. Hematol. Educ. Program 2019, 1–8 (2019).
Ho, A. et al. Adeno-associated virus 2 infection in children with non-A–E hepatitis. Nature https://doi.org/10.1038/s41586-023-05948-2 (2023).
Perrin, G. Q., Herzog, R. W. & Markusic, D. M. Update on clinical gene therapy for hemophilia. Blood 133, 407–414 (2019).
Divers, T. J., Tomlinson, J. E. & Tennant, B. C. The history of Theiler’s disease and the search for its aetiology. Vet J. 287, 105878 (2022).
Kelgeri, C. et al. Clinical spectrum of children with acute hepatitis of unknown cause. N. Engl. J. Med. 387, 611–619 (2022).
Sanchez, L. H. G. et al. A case series of children with acute hepatitis and human adenovirus infection. N. Engl. J. Med. 387, 620–630 (2022).
Yang, W. X. et al. Prevalence of serum neutralizing antibodies to adenovirus type 5 (Ad5) and 41 (Ad41) in children is associated with age and sanitary conditions. Vaccine 34, 5579–5586 (2016).
UKHSA. National norovirus and rotavirus bulletin. Week 28 report: data to week 26 (UKHSA, 2022).
Calcedo, R. et al. Adeno-associated virus antibody profiles in newborns, children, and adolescents. Clin. Vaccine Immunol. 18, 1586–1588 (2011).
Nathwani, A. C. et al. Sustained high-level expression of human factor IX (hFIX) after liver-targeted delivery of recombinant adeno-associated virus encoding the hFIX gene in rhesus macaques. Blood 100, 1662–1669 (2002).
Babraham Bioinformatics. Trim Galore! Babraham Bioinformatics https://www.bioinformatics.babraham.ac.uk/projects/trim_galore// (2019).
Langmead, B. & Salzberg, S. L. Fast gapped-read alignment with Bowtie 2. Nat. Methods 9, 357–359 (2012).
Schmieder, R., Edwards, R. & Bateman, A. Quality control and preprocessing of metagenomic datasets. Bioinformatics 27, 863–864 (2011).
Morgulis, A. et al. Database indexing for production MegaBLAST searches. Bioinformatics 24, 1757–1764 (2008).
Buchfink, B., Xie, C. & Huson, D. H. Fast and sensitive protein alignment using diamond. Nat. Methods 12, 59–60 (2015).
Li, H. et al. The Sequence Alignment/Map format and SAMtools. Bioinformatics 25, 2078–2079 (2009).
Nurk, S., Meleshko, D., Korobeynikov, A. & Pevzner, P. A. metaSPAdes: a new versatile metagenomic assembler. Genome Res. 27, 824–834 (2017).
Li, H. Minimap2: pairwise alignment for nucleotide sequences. Bioinformatics 34, 3094–3100 (2018).
Wood, D. E., Lu, J. & Langmead, B. Improved metagenomic analysis with Kraken 2. Genome Biol. 20, 257 (2019).
EPI2ME WIMP workflow: quantitative, real-time species identification from metagenomic samples. Oxford Nanopore Technologies https://nanoporetech.com/resource-centre/epi2me-wimp-workflow-quantitative-real-time-species-identification-metagenomic (accessed 2022).
Babraham Bioinformatics. re-DOT-able DotPlot tool. Babraham Bioinformatics https://www.bioinformatics.babraham.ac.uk/projects/redotable/ (accessed 2022).
Babraham Bioinformatics. FastQC: a quality control tool for high throughput sequence data. Babraham Bioinformatics https://www.bioinformatics.babraham.ac.uk/projects/fastqc/ (accessed 2022).
Afzal, S., Fronza, R. & Schmidt, M. VSeq-Toolkit: comprehensive computational analysis of viral vectors in gene therapy. Mol. Ther. Methods Clin. Dev. 17, 752–757 (2020).
Robinson, J. T. et al. Integrative genomics viewer. Nat. Biotechnol. 29, 24–26 (2011).
Aurnhammer, C. et al. Universal real-time PCR for the detection and quantification of adeno-associated virus serotype 2-derived inverted terminal repeat sequences. Hum. Gene Ther. Methods 23, 18–28 (2011).
Watzinger, F. et al. Real-time quantitative PCR assays for detection and monitoring of pathogenic human viruses in immunosuppressed pediatric patients. J. Clin. Microbiol. 42, 5189–5198 (2004).
Tann, C. J. et al. Prevalence of bloodstream pathogens is higher in neonatal encephalopathy cases vs. controls using a novel panel of real-time PCR assays. PLoS ONE 9, e97259 (2014).
Brown, J. R., Shah, D. & Breuer, J. Viral gastrointestinal infections and norovirus genotypes in a paediatric UK hospital, 2014–2015. J. Clin. Virol. 84, 1–6 (2016).
Karda, R. et al. Production of lentiviral vectors using novel, enzymatically produced, linear DNA. Gene Ther. 26, 86–92 (2019).
Myers, C. E. et al. Using whole genome sequences to investigate adenovirus outbreaks in a hematopoietic stem cell transplant unit. Front. Microbiol. 12, 667790 (2021).
Gaccioli, F. et al. Fetal inheritance of chromosomally integrated human herpesvirus 6 predisposes the mother to pre-eclampsia. Nat. Microbiol. 5, 901–908 (2020).
Quick, J. et al. Multiplex PCR method for MinION and Illumina sequencing of Zika and other virus genomes directly from clinical samples. Nat. Protoc. 12, 1261–1267 (2017).
Baker, D. J. et al. CoronaHiT: high-throughput sequencing of SARS-CoV-2 genomes. Genome Med. 13, 21 (2021).
Nakamura, T., Yamada, K. D., Tomii, K. & Katoh, K. Parallelization of MAFFT for large-scale multiple sequence alignments. Bioinformatics 34, 2490–2492 (2018).
Minh, B. Q. et al. IQ-TREE 2: new models and efficient methods for phylogenetic inference in the genomic era. Mol. Biol. Evol. 37, 1530–1534 (2020).
Hsu, H. L. et al. Structural characterization of a novel human adeno-associated virus capsid with neurotropic properties. Nat. Commun. 11, 3279 (2020).
Li, H. & Durbin, R. Fast and accurate short read alignment with Burrows–Wheeler transform. Bioinformatics 25, 1754–1760 (2009).
Cox, J. & Mann, M. MaxQuant enables high peptide identification rates, individualized p.p.b.-range mass accuracies and proteome-wide protein quantification. Nat. Biotechnol. 26, 1367–1372 (2008).
Tyanova, S. et al. The Perseus computational platform for comprehensive analysis of (prote)omics data. Nat. Methods 13, 731–740 (2016).
Cock, P. J. A. et al. Biopython: freely available Python tools for computational molecular biology and bioinformatics. Bioinformatics 25, 1422–1423 (2009).
Ng, J. et al. Gene therapy restores dopamine transporter expression and ameliorates pathology in iPSC and mouse models of infantile parkinsonism. Sci. Transl. Med. 13, eaaw1564 (2021).
Pollara, G. et al. Exaggerated IL-17A activity in human in vivo recall responses discriminates active tuberculosis from latent infection and cured disease. Sci. Transl. Med. 13, eabg7673 (2021).
Chandran, A. et al. Rapid synchronous type 1 IFN and virus-specific T cell responses characterize first wave non-severe SARS-CoV-2 infections. Cell Rep. Med. 3, 100557 (2022).
Zhang, X. et al. Proteome-wide identification of ubiquitin interactions using UbIA-MS. Nat. Protoc. 13, 530–550 (2018).
UKHSA funded the metagenomics and HAdV sequencing. We thank A. Nathwani for helpful discussions. We acknowledge the considerable contribution from the GOSH microbiology laboratory. We thank the medical students who contributed to the DIAMOND consortium. All research at GOSH and UCL GOSH Institute of Child Health is made possible by the NIHR GOSH Biomedical Research Centre. The views expressed are those of the authors and not necessarily those of the NHS, the National Institute for Health Research (NIHR), the UKRI or the Department of Health and Social Care. The work was part funded by the NIHR Blood and Transplant Research Unit in Genomics to Enhance Microbiology Screening (GEMS), the National Institute for Health and Care Research (CO-CIN-01) or jointly by NIHR and UK Research and Innovation (CV220-169, MC_PC_19059). S. Morfopoulou is funded by a W.T. Henry Wellcome fellowship (206478/Z/17/Z). S.B. and O.E.T.M. are funded by the NIHR Blood and Transplant Research Unit (GEMS). M.M.M. and M.L. are supported in part by the NIHR Biomedical Research Centre of Imperial College NHS Trust. J.B. receives NIHR Senior Investigator Funding. M.N. and J.B. are supported by the Wellcome Trust (207511/Z/17/Z and 203268/Z/16/Z). M.N., J.B. and G.P. are supported by the NIHR University College London Hospitals Biomedical Research Centre. P. Simmonds is supported by the NIHR (NIHR203338). T.S.J. is grateful for funding from the Brain Tumour Charity, Children with Cancer UK, GOSH Children’s Charity, Olivia Hodson Cancer Fund, Cancer Research UK and the NIHR. DIAMONDS is funded by the European Union (Horizon 2020; grant 848196). PERFORM was funded by the European Union (Horizon 2020; grant 668303).
J.B. is a member of the MHRA COVID Vaccines and Therapeutics committees; holder of Wellcome Trust, UKRI and NIHR funding; and principal investigator on the GSK LUNAR study to provide MHRA with data on SARS-CoV-2 sequences in patients treated with sotrovimab.
Peer review information
Nature thanks Frank Tacke and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Peer reviewer reports are available.
Extended data figures and tables
Extended Data Fig. 1 Evidence of AAV2 replication from meta-transcriptomics and RT-PCR.
Mapping of AAV2 reads to the reference genome for a, liver RNA-Seq from 4 cases and b, blood RNA-Seq from 2 cases. The horizontal lines in the same colour as the coverage graph are the predicted transcripts for each case. The horizontal lines in purple and green are the AAV2 genes. c, RT-PCR results for liver cases. N: negative PCR result.
Extended Data Fig. 2 Examples of AAV2 complexes.
The y axis shows the coordinates of a full-length AAV2 genome (rep gene in green and cap gene in yellow). The x axis is the nanopore read, with the length of the read indicated. Red dots indicate alignment to the forward strand and blue dots alignment to the reverse strand. a, Indicative complexes based on the literature8. b,c, Examples of complex structures with both head-to-tail and alternating repeats, from a total of n = 25 and n = 75 such reads for cases 3 and 5, respectively. b shows the longest 2 reads for each case. d, Examples of truncated monomeric structures, from a total of n = 27 and n = 103 such reads for cases 3 and 5, respectively (Supplementary Table 3). The longest such read for each case is shown.
Extended Data Fig. 3 HAdV and AAV2 sequence analysis.
a, HAdV SNP plot: Visualisation of the multiple alignment of HAdV-F41 genomic sequences from the same clade as the single sequence from a case (highlighted in grey) (Fig. 3a). Includes both contemporary controls and publicly available HAdV-F41 genomes from GenBank. Consensus-level mutations differing from the reference sequence (bottom) are highlighted across the genome. Genomic position of the mutation is shown at the top of the plot. b, Variants between stool complete HAdV genome from case JBB27 and combined blood partial genomes from other cases. c, Frequency table of capsid residues in cases and historical controls. There is no difference between the capsid sequences of cases and contemporaneously circulating controls. However, there are changes compared with historical controls in all contemporary sequences. None of the recently acquired capsid changes are shared with known hepatotropic strains in AAV7, 8 and 9. d, Amino acid differences between AAV2 capsid sequences from cases, contemporaneously circulating controls and historical publicly available sequences compared with the AAV2 reference sequence NC_001401.2. Also shown are the capsid sequences from known AAV7, 8 and 9 hepatotropic capsids compared to the reference sequence NC_001401.2.
Extended Data Fig. 4 AAV2 capsid analysis.
a, Amino acid sequence of novel AAV capsid variant. The consensus sequence of the VP1 sequence used for investigation of capsid transduction characteristics (AAVHepcase) is shown with alignment to canonical AAV2 VP1 (AAV2gp05). The alignment shows AAV2 amino acids that are different to the AAVHepcase sequence, with dots indicating matched amino acids. b, In vitro analysis of AAV capsid transduction characteristics. Huh-7 hepatocytes were treated at MOI 100,000 with rAAV vectors containing capsid sequences derived from canonical AAV2, a consensus sequence derived from patient sequencing samples (Hepcase), LK03 or AAV9 (n = 3 each treatment). Transduction efficiency was determined by flow cytometry, based on the percentage of EGFP-positive cells, the EGFP fluorescence intensity in positive cells and the ‘relative activity’ of EGFP expression (calculated by multiplying %GFP-positive cells by MFI/100; ref. 70). Transductions were performed in the presence or absence of 400 µg/mL heparin to investigate the role of HSPG interaction. Heparin competition significantly affected rAAV2 transduction in terms of percentage of GFP-positive cells (P = 0.0016), MFI (P = 0.000008) and relative activity (P = 0.000008), whereas other capsids, including that derived from AAV Hepcase, were not affected by heparin. All data were analysed by 2-sided t-test with Bonferroni post-hoc analysis. Error bars indicate standard deviation from the mean value. c, Images of Huh-7 cells treated with rAAV vectors in vitro. Each cell population was treated with MOI 100,000 of the relevant viral vector, in the presence or absence of 400 µg/mL heparin, and analysed by EGFP fluorescence 72 hours post-transduction. Scale bars = 300 µm.
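The 'relative activity' metric in panel b is a simple product, and a short Python sketch makes the arithmetic explicit. It assumes the caption means the percentage of GFP-positive cells multiplied by the fluorescence intensity (MFI) and divided by 100; the function name and the input values are hypothetical and are not measurements from this study.

```python
def relative_activity(percent_positive, mfi):
    """Relative activity = % GFP-positive cells x MFI / 100 (assumed reading)."""
    return percent_positive * mfi / 100


# Hypothetical example: 50% positive cells with an MFI of 2,000 arbitrary units.
example_activity = relative_activity(50.0, 2000.0)
```

Combining the two measurements in this way distinguishes a vector that transduces many cells weakly from one that transduces few cells strongly, which neither the percentage nor the MFI shows on its own.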
Extended Data Fig. 5 Representative histology of case livers.
a,b, H&E sections (×100 and ×200) showing a pattern of acute hepatitis with parenchymal disarray; there is a normal, uninflamed portal tract at the lower left of image a. Spotty inflammation and apoptotic bodies are shown in b, along with perivenular hepatocyte loss/necrosis. Immunohistochemistry shows fewer mature B lymphocytes (CD20, panel c) than T lymphocytes (CD3, panel d, pan T cell marker), most of which are cytotoxic CD8 lymphocytes (panel e). In conclusion, the livers of these children have a distinctive pattern of damage that does not indicate a specific aetiology; it neither excludes nor offers positive support for either autoimmune hepatitis or a direct cytopathic effect of virus on hepatocytes. Each image shows a representative result from histology carried out on a minimum of five cases.
Extended Data Fig. 6 Immunohistochemistry results for cases of unexplained hepatitis and control tissues.
a, Inflammatory markers (IgG, C4d, HLA-ABC, HLA-DR) in acute hepatitis cases and control liver. IgG, HLA-ABC and HLA-DR show a canalicular pattern in the control liver. This pattern is disrupted in the acute hepatitis cases due to the architectural collapse. In addition, there is increased staining associated with inflammatory cell/macrophage infiltrates. C4d shows very weak staining in the acute hepatitis cases, associated with macrophages but without endothelial staining. All stains were undertaken on 5 affected cases and 13 control cases. b, Representative images of the immunohistochemistry (IHC). Acute hepatitis liver explant cases stained for HHV6 (A; arrow shows staining of representative cells), adenovirus (B) and AAV2 (C, polyclonal antibody; E, monoclonal antibody, clone A1). Paraffin-embedded AAV2-transfected cell lines stained as positive controls for AAV2 (D, polyclonal antibody; F, monoclonal antibody, clone A1). All scale bars are 60 micrometres. HHV6 and AAV2 (polyclonal) stains were undertaken on 15 affected cases and 13 controls. AAV2 (A1) stains were undertaken on 5 affected cases and 13 control cases. Staining for adenovirus was undertaken on 5 affected cases.
Extended Data Fig. 7 Cytokine inducible transcriptional modules.
Volcano plot of cytokine inducible transcriptional modules (n = 52) comparing their Z score expression in AAV2-associated hepatitis (n = 4) and HBV-associated hepatitis (n = 17) requiring transplantation using two-tailed unpaired t tests with Holm Sidak multiple testing correction for adjusted p values (n refers to number of patients). Each point represents a specific module listed in full in Supplementary Table 13. Labels for selected modules are shown.
Extended Data Fig. 8 HLA and HHV-6B proteins in case livers.
a–e, Ranking of the quantified proteins using the log10 of iBAQ values for JBL1 (a), JBL2 (b), JBL3 (c), JBL4 (d) and JBL5 (e). f, Scatter plot of quantified proteins in sample JBL4 versus JBL5. HLA proteins are highlighted in red. Red arrows denote HLA-DRB1 proteins. HHV6 proteins are highlighted in green and marked with green arrows.
Supplementary Table 1
Clinical details of cases. 12 transplanted cases, 26 non-transplanted. Median age of cases where exact age is known is 3 years, with range 1.5-9. Case 10 was 9 years old. All other cases were aged 7 or under. Of 22 cases where the gender is known, 12 cases were female and 10 were male. Where known, all cases were of white ethnicity other than two.
Supplementary Table 2
Metagenomics summary statistics: raw read counts, human filtered and other findings. F: sequencing failed.
Supplementary Table 3
Nanopore sequencing. All four samples were sequenced to a lower depth. Case 3 and 5 underwent a second round of deeper sequencing. N50s across all sequencing runs were similar.
Supplementary Table 4
Clinical details for PERFORM/DIAMONDS immunocompetent controls and microbiological testing by referring laboratory for DIAMONDS controls. P: positive, N: negative, IC: inconclusive, VL: viral load.
Supplementary Table 5
PCR Results. Not all samples were tested for all viruses due to lack of remaining material. LLP: low level positive (Ct value > 38 and < 45), ND: not determined (negative PCR results), NA: not tested due to lack of material.
Supplementary Table 6
Clinical details of liver controls and comparators
Supplementary Table 7
HAdV whole genome sequencing. OTR: on target reads, MRD: mean read depth, Coverage 1X: percentage of genome covered at depth 1, Coverage 30X: percentage of genome covered at depth 30.
Supplementary Table 8
Mapping of HAdV partial sequences.
Supplementary Table 9
AAV2 whole genome sequencing: OTR: on target reads, MRD: mean read depth, Coverage 1X: percentage of genome covered at depth 1, Coverage 10X: percentage of genome covered at depth 10.
Supplementary Table 10
HHV-6B whole genome sequencing from liver case samples. OTR: on target reads, MRD: mean read depth, Coverage 1X: percentage of genome covered at depth 1, Coverage 10X: percentage of genome covered at depth 10.
Supplementary Table 11
Cytokine modules - cytokine-inducible transcriptional signatures of cell-mediated immunity.
Supplementary Table 12
Summary statistics for multiple comparisons of cytokine transcriptional modules. Two-tailed unpaired t-tests with Holm Sidak multiple testing correction for adjusted p values were performed.
Supplementary Table 13
List of differentially expressed proteins between 5 cases and 7 controls. The p-values were calculated by applying two-tailed empirical Bayes moderated t-statistics on protein-wise linear models. P values were not adjusted for multiple comparisons.
Supplementary Table 14
List of differentially expressed peptides between 5 cases and 7 controls. The p-values were calculated by applying two-tailed empirical Bayes moderated t-statistics on peptide-wise linear models. P values were not adjusted for multiple comparisons.
Supplementary Table 15
HLA allele frequency of cases.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Morfopoulou, S., Buddle, S., Torres Montaguth, O.E. et al. Genomic investigations of unexplained acute hepatitis in children. Nature 617, 564–573 (2023). https://doi.org/10.1038/s41586-023-06003-w