Diagnosis of cytomegalovirus infection from clinical whole genome sequencing


Rapid whole genome sequencing (rWGS) of peripheral blood has been used to detect microbial DNA in acute infections. Cytomegalovirus (CMV) is a herpesvirus capable of causing severe disease in neonates and immunocompromised patients. We identified CMV in patients undergoing diagnostic rWGS by matching reads that did not align to the human reference genome to a database of microbial genomes. rWGS was conducted on peripheral blood obtained from ill pediatric patients (age 1 day to 18 years). Reads not aligning to the human genome were analyzed using an in-house pipeline to identify DNA consistent with CMV infection. Of 669 patients who received rWGS from July 2016 through July 2019, we identified 28 patients (4.2%) with reads that aligned to the CMV reference genome. Six of these patients had clinical findings consistent with symptomatic CMV infection. Positive results were highly correlated (R2 > 0.99, p < 0.001) to a CMV-qPCR assay conducted on DNA isolated from whole blood samples. In acutely ill children receiving rWGS for diagnosis of genetic disease, we propose analysis of patient genetic data to identify CMV, which could impact treatment of up to 4% of children in the intensive care unit.


Rapid whole genome sequencing (rWGS) of peripheral blood can rapidly diagnose genetic diseases and impact clinical care within 24 h1,2. A recent study showed a diagnosis rate of 43% in critically ill patients with a resultant change in management in 33% of patients1. rWGS allows clinicians to make actionable diagnoses in critically ill patients who may have an underpinning genetic or metabolic disease and is increasingly becoming a part of standard care. The concept of applying rWGS to detect infectious pathogens has also been utilized3,4,5,6. While plasma polymerase chain reaction (PCR) is often used for detection of infectious organisms, we theorized that an advantage of sequencing whole blood is that organisms that reside intracellularly within white blood cells are included in rWGS reads. Cytomegalovirus (CMV) is human herpesvirus 5 (HHV-5) and capable of causing disease in critically ill patients7. It can be found both intracellularly and extracellularly in humans. CMV can cause hepatitis, marrow suppression (including thrombocytopenia), encephalitis, retinitis, and pneumonitis. In neonates, CMV may also manifest as growth restriction, microcephaly, neurologic defects (including sensorineural hearing loss), and intellectual disability. Prevalence of congenital CMV ranges from 0.2 to 2.0% of pregnancies. Although most infected infants will have only latent infection without any sequelae, up to 12%—representing approximately 9,600 children per year in the United States—will experience neurological sequelae related to congenital CMV infection8. Recent studies have demonstrated improved outcomes in neonates with congenitally acquired CMV infection if antiviral treatment is started within the first 30 days of life9. Identifying CMV infection in critically ill children who are undergoing rWGS has the potential to significantly influence their outcomes. We sought to retrospectively detect the presence of CMV in patients undergoing rWGS and to identify those in whom CMV may have contributed to the severity of their illness.


Peripheral blood was obtained from acutely ill pediatric patients (age 1 day to 18 years) after parental consent and patient assent, where indicated, were obtained. Informed consent was obtained from all subjects or, if subjects were under 18 years, from a parent and/or legal guardian, including permission for publication. Research was conducted according to the principles of the Declaration of Helsinki and as approved by the University of California San Diego Institutional Review Board. All experimental protocols were approved by the University of California San Diego Institutional Review Board. Genomic DNA was extracted, paired-end genomic libraries were prepared without PCR, and sequenced (2 × 100 bp) on Illumina (San Diego, CA, USA) HiSeq 2,500, HiSeq 4,000 or NovaSeq 6,000 instruments, as previously described, to a whole genome redundant depth of ~ 40-fold1. Reads were aligned to the human genome (hg19) as previously described1. Unaligned reads were analyzed using an in-house built pipeline. Reads were first extracted from the whole genome binary alignment map (BAM) and realigned to the human genome with Burrows-Wheeler Aligner-Maximal Exact Match (BWA-MEM) using less stringent parameters (− c 10,000) to allow repetitive reads derived from the human genome to be aligned10. Reads with low quality (> 1/8th of the read length with a phred-like score of 1) or low complexity (> 25% of the read has the same dinucleotide) were removed. The remaining reads were aligned to a database of viral and bacterial genomes retrieved from the Microbial Genome Database (https://mbgd.genome.ad.jp/, accessed May 2017) and National Center for Biotechnology Information (https://www.ncbi.nlm.nih.gov/genome/viruses/) (accessed Apr 2017)11. Duplicate alignments were then marked and removed using Sambamba (https://github.com/biod/sambamba)12. Only reads of at least 100 consecutive nucleotides were counted for each microbial species in each patient, this value was chosen to be stricter than the suggested metagenomic sequencing guideline (75 bp) to ensure accuracy in a clinical setting13. Read counts were normalized based on total rWGS coverage. We abstracted electronic health record data of all patients who had rWGS performed at Rady Children’s Institute for Genomic Medicine (RCIGM) between July 2016 and July 2019. In patients for whom charts were available in the electronic medical record, a manual review was performed by a pediatric infectious disease fellow (NR) to determine if the patients had clinical features consistent with CMV infection and/or positive clinical plasma CMV PCR results (DiaSorin Molecular CMV Primer Pair and the LIAISON MDX Thermocycler, Cypress, CA, USA). In false negative patients who received rWGS testing and were positive for CMV by plasma CMV PCR, but were negative by realignment of unaligned rWGS reads, we performed CMV PCR on the sample used for rWGS. We also performed CMV PCR testing on the saved DNA specimens used for rWGS testing on the patients in whom two or more reads of CMV were detected. Each qPCR run was performed with a high titer and low titer external positive control (Exact Diagnostics, Fort Worth, TX) in addition to a negative control. We performed Sanger sequencing PCR reactions as earlier described using previously validated primers on these DNA specimens to exclude the presence of a PCR inhibitor1.

Ethics approval

Peripheral blood was obtained from ill pediatric patients (age 1 day to 18 years) after parental consent and patient assent or consent where possible according to University of California San Diego Institutional Review Board. Research was conducted according to the principles of the Declaration of Helsinki.

Consent for publication

All patients were consented. When possible patient assent was obtained as well.


From July 2016 through July 2019, 669 patients received rWGS at RCIGM. Of these, 164 were found to have at least one read that did not align to the reference human genome and that completely aligned to a human herpesvirus (HHV) genome: one subject with reads aligning to HHV-4 (Epstein-Barr virus, NC_009334.1), 28 with reads aligning to HHV-5 (CMV, NC_006273.2), 69 with HHV-6 (NC_001664.2 or NC_000898.1), and 66 with HHV-7 (NC_001716.2). The 28 patients in whom we detected two or more reads of HHV-5 (CMV) represented 4.2% of the total population evaluated. Through chart review we identified clinical characteristics suggestive of symptomatic CMV infection in six of these 28 patients (21%) (See Table 1, case summaries are found in the Supplement).

Table 1 Symptomatic patients with CMV identified by rWGS pipeline.

Of these six patients, three had never been evaluated for CMV infection. Only one patient had been started on antiviral treatment. To assess for false-negative results associated with our detection methods, we performed a chart review of all children who received rWGS testing at RCIGM to identify patients found to have CMV infection by standard-of-care testing. Given the difficulty of interpreting serology, we limited our search to patients with positive CMV PCR tests14. We identified five patients who were CMV quantitative PCR (qPCR) positive but not identified by rWGS read re-alignment (Table 1 in Supplementary material). Recognizing that CMV testing and rWGS were often done at different times during the patient’s hospital course, we performed CMV qPCR testing on the same DNA sample (dedicated CMV qPCR) that had been used for rWGS. PCR testing detected CMV in only one of five patients tested. In that case, CMV levels were too low (< 227 IU/mL) to be quantifiable, indicating a very low viral load. We additionally performed a dedicated CMV qPCR on the saved DNA specimen used for rWGS for the 28 patients with two counts or more of reads consistent with CMV (Table 2).

Table 2 Patients with 2 reads or more reads of CMV detected in rWGS. CMV PCR was performed on the same DNA sample used for rWGS testing.

To assess for the presence of a PCR inhibitor, we used Sanger PCR reaction on all 28 samples tested. All of the samples underwent testing with two individual validated oligo sets. Each oligo set included one internal amplification control and one negative control. In 13 of the 14 patients with only two unaligned reads consistent with CMV, PCR reaction did not detect CMV. In a patient with three reads of CMV in WGS data and another with 19 reads in the rWGS data, CMV was not detected by PCR. Above a read count of 19, CMV was consistently detected by PCR reaction and quantifiable. The infectious unit count detected by PCR was highly correlated with the read count from rWGS (R2 = 0.9942; p < 0.0001; see Fig. 1.).

Figure 1

Log2 CMV read count from rWGS versus viral load from CMV-quantitative PCR assay for 8 samples where rWGS detected CMV DNA and quantitative PCR could quantify viral load. rWGS read results are highly correlated to infectious unit determination by gold-standard quantitative PCR assay.


Of 669 subjects who underwent rWGS at RCIGM from July 2016 through July 2019, we identified 28 patients with two or more DNA sequence reads consistent with CMV DNA. We also identified 164 patient genomes with unaligned reads consistent with DNA from any herpesvirus. Chart review identified clinical findings such as hepatitis, seizures, and thrombocytopenia that could be consistent with symptomatic CMV infection in 6 of the 28 in whom the presence of CMV was detected (see Table 1). Moreover, in these 6 case histories, even if the primary diagnosis may overlap with symptomology of CMV infection, CMV may have still have contributed to the symptoms experienced by these patients (and possibly even been the major cause of illness in case number 6). Of these, three patients had not previously been evaluated for CMV, and only one patient had been started on antiviral therapy. Two of the patients (cases 1 and 4) were neonates with possible congenital CMV infection and would have met inclusion criteria for treatment based on Kimberlin et al.9 In cases 2, 3, 5, and 6, though it is not possible to distinguish congenitally versus postnatally acquired infection, all of these patients were relatively immunosuppressed as a result of their primary diagnosis or from their treatment regimen.

In 5 cases with a positive clinical qPCR, we did not identify a CMV infection by WGS. In 4 of these cases, a confirmatory qPCR conducted on the plasma obtained for WGS was negative, likely due to a waning viremia. In the remaining case, CMV was detected, but below the detection limit for quantitation. In one case, a substantial HHV7 viremia was detected, and it may be possible that the initial CMV positive was due to cross reactivity with one of the qPCR primers for HHV7. Although qPCR can be highly specific, it is generally tested in the background of subjects with no viremia rather than a different HHV infection15. Future studies will need to be conducted to test if WGS can also be more sensitive than qPCR in some cases.

The clinical spectrum of CMV is highly varied and overlaps with many other disease processes. The ability of clinicians to identify active CMV viremia clinically can therefore be challenging16. In neonates in particular, even patients with mild disease initially may have significant morbidity, including sensorineural hearing loss17. A land mark study by Kimberlin and colleagues in 2015 demonstrated that morbidity from symptomatic congenital CMV infection could be reduced or averted by early initiation of antiviral therapy9. Recent studies have explored the utility of hearing screening to identify congenital CMV, though with a sensitivity of just 53%, this strategy will miss many affected patients8,18,19. Current diagnostic approaches are inadequate to identify CMV infection in vulnerable patients in a timely manner.

The use of unaligned reads from rWGS testing offers a cost-effective approach to identifying infectious pathogens while simultaneously testing for pathogenic variants in the host genome20. Given that most patients undergoing rWGS are critically ill, it is likely that many of these patients have an infectious process superimposed on an underlying medical condition that led to a severe disease state7,21,22. CMV, which is associated with significant morbidity in neonates and immunodeficient hosts, is a pathogen that may be implicated in many such cases. Analyzing rWGS for microbial DNA enables the testing laboratory to alert clinicians in a timely manner to the presence of this pathogen while simultaneously evaluating the host genome for other pathogenic conditions. The results of this assay are highly correlated to orthogonal testing with CMV-specific qPCR. Moreover, the additional cost of testing to confirm CMV infection following rWGS is low. Whole genome data may also play a role in understanding the natural history of congenital CMV beyond the identification of apparently symptomatic children. This is especially relevant given the calls for population-based whole genome sequencing of neonates and for population-level screening for CMV, which will identify many more children infected with CMV whose clinical course is uncertain16. Additional research using rWGS may help clarify which CMV positive neonates would benefit from potentially toxic antiviral therapy16.

This study has several limitations. We conducted our analysis retrospectively, and the patients we identified were therefore not systematically assessed for CMV infection by the treating team during their hospital stay. Furthermore, even in patients who were evaluated for CMV, the rWGS sample was often not obtained at the same time as the evaluation for CMV, which is significant given that CMV viremia may wane over time or resolve altogether23. Our chart review was also limited by the wide clinical variation in what may be attributable to CMV infection (hepatitis, cytopenia, etc.), and in most of our patients, their underlying metabolic or genetic disease process was also associated with many of the classic laboratory findings associated with CMV infection. PCR testing on the 28 samples from patients in whom two or more reads of CMV were detected by rWGS was done on DNA extracted from whole blood samples rather than on plasma as is traditionally done when CMV viremia is suspected. However, the common practice of obtaining plasma CMV PCR in the clinical setting is not necessarily due to a limitation in detection of CMV from whole blood24,25. Additionally, given the negative qPCR results for the majority of patients with only 2 reads of CMV and the positive correlation between read count and higher PCR quantitation, there is the intriguing possibility that rWGS may be more sensitive than qPCR for CMV, an observation that has been made for other organsims26. Further studies are required to prospectively compare and correlate CMV, and potentially other microbe, read count from rWGS with standard care testing, define the specificity and sensitivity of this approach, and ultimately to demonstrate clinical utility.

rWGS may be a rapid, extremely sensitive method to detect some microbial infections. We propose that, in critically ill children under going rWGS, unaligned reads be used to assess the presence of microbial DNA and that this analysis should be conducted regardless of prior clinical diagnosis. This approach has the potential to identify clinically relevant pathogens such as CMV, which may cause significant disease in certain vulnerable patient populations. Moreover, for CMV infections in particular, early initiation of antivirals may positively modify patient outcomes.


  1. 1.

    Farnaes, L. et al. Rapid whole-genome sequencing decreases infant morbidity and cost of hospitalization. NPJ Genomic Med. 3, 10 (2018).

    Article  Google Scholar 

  2. 2.

    Clark, M. M. et al. Diagnosis of genetic diseases in seriously ill children by rapid whole-genome sequencing and automated phenotyping and interpretation. Sci. Transl. Med. 11, eaat6177 (2019).

    Article  Google Scholar 

  3. 3.

    Blauwkamp, T. A. et al. Analytical and clinical validation of a microbial cell-free DNA sequencing test for infectious disease. Nat. Microbiol. 4, 663 (2019).

    CAS  Article  Google Scholar 

  4. 4.

    Schlaberg, R. et al. Viral pathogen detection by metagenomics and Pan-viral group polymerase chain reaction in children with pneumonia lacking identifiable etiology. J. Infect. Dis. 215, 1407–1415 (2017).

    CAS  Article  Google Scholar 

  5. 5.

    Thoendel, M. J. et al. Identification of prosthetic joint infection pathogens using a shotgun metagenomics approach. Clin. Infect. Dis. 67, 1333–1338 (2018).

    CAS  Article  Google Scholar 

  6. 6.

    Leggett, R. M. et al. Rapid MinION profiling of preterm microbiota and antimicrobial-resistant pathogens. Nat. Microbiol. 5, 1–13. https://doi.org/10.1038/s41564-019-0626-z (2019).

    CAS  Article  Google Scholar 

  7. 7.

    Davila, S. et al. Viral DNAemia and immune suppression in pediatric sepsis. Pediatr. Crit. Care Med J. Soc. Crit. Care Med. World Fed. Pediatr. Intensive Crit. Care Soc. 19, e14–e22 (2018).

    Google Scholar 

  8. 8.

    Ludwig, A. & Hengel, H. Epidemiological impact and disease burden of congenital cytomegalovirus infection in Europe. Eurosurveillance 14, 19140 (2009).

    PubMed  Google Scholar 

  9. 9.

    Kimberlin, D. W. et al. Valganciclovir for symptomatic congenital cytomegalovirus disease. N. Engl. J. Med. 372, 933–943 (2015).

    CAS  Article  Google Scholar 

  10. 10.

    Li, H. Aligning sequence reads, clone sequences and assembly contigs with BWA-MEM. Q-Bio, ArXiv arXiv:1303.3997 (2013).

  11. 11.

    Uchiyama, I. MBGD: microbial genome database for comparative analysis. Nucleic Acids Res. 31, 58–62 (2003).

    CAS  Article  Google Scholar 

  12. 12.

    Tarasov, A., Vilella, A. J., Cuppen, E., Nijman, I. J. & Prins, P. Sambamba: fast processing of NGS alignment formats. Bioinformatics 31, 2032 (2015).

    CAS  Article  Google Scholar 

  13. 13.

    Bokulich, N. A. et al. Quality-filtering vastly improves diversity estimates from Illumina amplicon sequencing. Nat. Methods 10, 57–59 (2013).

    CAS  Article  Google Scholar 

  14. 14.

    Boppana, S. B. et al. Saliva polymerase-chain-reaction assay for cytomegalovirus screening in newborns. N. Engl. J. Med. 364, 2111–2118 (2011).

    CAS  Article  Google Scholar 

  15. 15.

    Ross, S. A., Novak, Z., Pati, S. & Boppana, S. B. Diagnosis of cytomegalovirus infections. Infect. Disord. Drug Targets 11, 466–474 (2011).

    CAS  Article  Google Scholar 

  16. 16.

    Rawlinson, W. D. et al. Congenital cytomegalovirus infection in pregnancy and the neonate: consensus recommendations for prevention, diagnosis, and therapy. Lancet Infect. Dis. 17, e177–e188 (2017).

    Article  Google Scholar 

  17. 17.

    Lanzieri, T. M. et al. Hearing loss in children with asymptomatic congenital cytomegalovirus infection. Pediatrics 139, e20162610 (2017).

    Article  Google Scholar 

  18. 18.

    Fowler, K. B. et al. A targeted approach for congenital cytomegalovirus screening within newborn hearing screening. Pediatrics 139, e20162128 (2017).

    Article  Google Scholar 

  19. 19.

    Park, A. H. et al. A diagnostic paradigm including cytomegalovirus testing for idiopathic pediatric sensorineural hearing loss. The Laryngoscope 124, 2624–2629 (2014).

    Article  Google Scholar 

  20. 20.

    Sanford, E. et al. Concomitant diagnosis of immune deficiency and Pseudomonas sepsis in a 19 month old with ecthyma gangrenosum by host whole-genome sequencing. Mol. Case Stud. 4, a003244 (2018).

    CAS  Article  Google Scholar 

  21. 21.

    Farnaes, L., Coufal, N. G. & Spector, S. A. Vaccine strain varicella infection in an infant with previously undiagnosed perinatal human immunodeficiency type-1 infection. Pediatr. Infect. Dis. J. 38, 413–415 (2019).

    Article  Google Scholar 

  22. 22.

    Walton, A. H. et al. Reactivation of multiple viruses in patients with sepsis. PLoS ONE 9, e98819 (2014).

    Article  ADS  Google Scholar 

  23. 23.

    Ng, A. P. et al. Cytomegalovirus DNAemia and disease: incidence, natural history and management in settings other than allogeneic stem cell transplantation. Haematologica 90, 1672–1679 (2005).

    CAS  PubMed  Google Scholar 

  24. 24.

    Lisboa, L. F. et al. The clinical utility of whole blood versus plasma cytomegalovirus viral load assays for monitoring therapeutic response. Transplantation 91, 231–236 (2011).

    Article  Google Scholar 

  25. 25.

    Chen, X. Y., Hou, P. F., Bi, J. & Ying, C. M. Detection of human cytomegalovirus DNA in various blood components after liver transplantation. Braz. J. Med. Biol. Res. 47, 340–344 (2014).

    CAS  Article  Google Scholar 

  26. 26.

    Andersen, S. C., Fachmann, M. S. R., Kiil, K., Møller Nielsen, E. & Hoorfar, J. Gene-based pathogen detection: can we use qPCR to predict the outcome of diagnostic metagenomics?. Genes 8, 332 (2017).

    Article  Google Scholar 

Download references


Medical editing services were provided by Kevin Lomangino and Karen Doyle of KWF Consulting, Baltimore, MD, USA. This study was supported in part by grant U19HD077693 from NICHD and NHGRI to S.F.K., and gifts from the Liguori Family, John Motter and Effie Simanikas, Ernest and Evelyn Rady, and Rady Children’s Hospital San Diego. All authors had full access to all data, and MB had final responsibility for the decision to submit for publication.

Author information




Authors M.B., L.F., D.D., C.H., S.K., and N.R. conceived the study design. N.R. performed the chart review and drafted the initial manuscript. Author Y.D. performed sample processing. Authors M.B., L.F., D.D., S.K., C.H., and Y.D. reviewed and contributed to the manuscript.

Corresponding author

Correspondence to Matthew Bainbridge.

Ethics declarations

Competing interests

All authors have completed the ICMJE uniform disclosure form at www.icmje.org/coi_disclosure.pdf and declare: no financial relationships with any organizations that might have an interest in the submitted work in the previous three years and no other relationships or activities that could appear to have influenced the submitted work.

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

Reprints and Permissions

About this article

Verify currency and authenticity via CrossMark

Cite this article

Ramchandar, N., Ding, Y., Farnaes, L. et al. Diagnosis of cytomegalovirus infection from clinical whole genome sequencing. Sci Rep 10, 11020 (2020). https://doi.org/10.1038/s41598-020-67656-5

Download citation


By submitting a comment you agree to abide by our Terms and Community Guidelines. If you find something abusive or that does not comply with our terms or guidelines please flag it as inappropriate.