Genome sequencing analysis identifies Epstein–Barr virus subtypes associated with high risk of nasopharyngeal carcinoma

Abstract

Epstein–Barr virus (EBV) infection is ubiquitous worldwide and is associated with multiple cancers, including nasopharyngeal carcinoma (NPC). The importance of EBV viral genomic variation in NPC development and its striking epidemic in southern China has been poorly explored. Through large-scale genome sequencing of 270 EBV isolates and two-stage association study of EBV isolates from China, we identify two non-synonymous EBV variants within BALF2 that are strongly associated with the risk of NPC (odds ratio (OR) = 8.69, P = 9.69 × 10⁻²⁵ for SNP 162476_C; OR = 6.14, P = 2.40 × 10⁻³² for SNP 163364_T). The cumulative effects of these variants contribute to 83% of the overall risk of NPC in southern China. Phylogenetic analysis of the risk variants reveals a unique origin in Asia, followed by clonal expansion in NPC-endemic regions. Our results provide novel insights into the NPC endemic in southern China and also enable the identification of high-risk individuals for NPC prevention.
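
As a rough illustration of the two quantities quoted in the abstract (an odds ratio for a risk variant and the share of overall risk attributed to it), the following minimal sketch computes an odds ratio with a Woolf-type 95% confidence interval from a 2×2 carrier table, then applies Levin's formula for the population attributable fraction. This is not the authors' method: the study used mixed-model association tests and a model-based attributable fraction, and the counts below are hypothetical placeholders, not values from the paper. The function names `odds_ratio` and `levin_paf` are also illustrative.

```python
import math


def odds_ratio(case_carriers, case_noncarriers, ctrl_carriers, ctrl_noncarriers):
    """Return (OR, lower, upper) using a 95% Woolf (log-scale) interval."""
    a, b, c, d = case_carriers, case_noncarriers, ctrl_carriers, ctrl_noncarriers
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for this simple estimator")
    or_ = (a * d) / (b * c)
    # Standard error of log(OR) from the four cell counts.
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(or_) - 1.96 * se)
    upper = math.exp(math.log(or_) + 1.96 * se)
    return or_, lower, upper


def levin_paf(or_, carrier_freq_controls):
    """Levin's attributable fraction, using the OR as a stand-in for relative risk."""
    excess = carrier_freq_controls * (or_ - 1)
    return excess / (1 + excess)


# Hypothetical counts, NOT taken from the paper.
or_, lo, hi = odds_ratio(120, 36, 15, 32)
paf = levin_paf(or_, 15 / 47)
print(f"OR = {or_:.2f} (95% CI {lo:.2f}-{hi:.2f}), PAF = {paf:.2f}")
```

Using the odds ratio in place of a relative risk is only reasonable when the disease is rare in the source population, and the carrier frequency in controls is taken here as an estimate of the population frequency.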

Fig. 1: Principal component and phylogenetic analyses of EBV genomes.
Fig. 2: Genome-wide association analysis of EBV variants in 156 NPC cases and 47 controls.

Data availability

The EBV sequencing data are deposited in the US National Center for Biotechnology Information (NCBI) database under BioProject ID PRJNA522388. EBV sequences are released in the NCBI database under GenBank IDs MK540241–MK540470.
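
For readers who want to work with the deposited sequences, a minimal sketch of expanding the GenBank ID range into individual accessions is given below. It assumes the IDs above denote an inclusive range from MK540241 through MK540470; the helper name `expand_accessions` is illustrative, and the sketch only builds the list and does not contact NCBI.

```python
def expand_accessions(first: str, last: str) -> list[str]:
    """Expand an inclusive accession range that shares a letter prefix."""
    prefix = first.rstrip("0123456789")
    digits_first = first[len(prefix):]
    digits_last = last[len(prefix):]
    if not last.startswith(prefix) or not digits_first or not digits_last.isdigit():
        raise ValueError("accessions must share a prefix and end in digits")
    start, end = int(digits_first), int(digits_last)
    if end < start:
        raise ValueError("range end precedes range start")
    # Keep the numeric part zero-padded to its original width.
    width = len(digits_first)
    return [f"{prefix}{n:0{width}d}" for n in range(start, end + 1)]


# Assumed reading of the range stated in the data availability section.
accessions = expand_accessions("MK540241", "MK540470")
print(len(accessions), accessions[0], accessions[-1])
```

The resulting list can be passed to whichever sequence retrieval tool is preferred.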

Acknowledgements

We thank all of the participants for their generous support of the current study. We also thank R. Sun, C. Wang, H. Chen, J. Shen and C. Jie for helpful discussions on viral biology and genetic statistical, evolutionary and phylogenetic analyses, W.-S. Liu and X. Zuo for providing code support, Z. Lin (Tulane University) for kindly sharing EBV genome annotation files and J.-Y. Shao from Sun Yat-sen University Cancer Center for providing the MassArray iPlex platform. This work was supported by the National Natural Science Foundation of China (81430059 to Y.-X.Z. and 81872228 to M.X.), the National Key R&D Program of China (2016YF0902000 to Y.-X.Z., and 2018YFC1406902 and 2018YFC0910400 to W.Z.), the National Cancer Institute at the US National Institutes of Health (NIH) (R01CA115873-01 to H.-O.A. and Y.-X.Z., and R35-CA197449, P01-CA134294, U01-HG009088 and U19-CA203654 to X.L.) and the Agency of Science, Technology and Research (A*STAR), Singapore (to J.L.).

Author information

Contributions

Y.-X.Z., J.L. and W.Z. were the principal investigators who conceived the study. Y.-X.Z., J.L., W.Z. and M.X. designed and oversaw the study. J.L. and X.L. supervised the viral genome-wide association studies. W.W. supervised phylogenetic analysis. M.X. contributed to sample preparation, sequencing, genotyping, variant calling and genetic statistical analyses. Y.Y. contributed to sequencing, genotyping and variant calling. H.C. contributed to phylogenetic analyses. S.Z. contributed to genotyping and genetic statistical analyses. Z.Li contributed to genetic statistical analyses. Z.Z. contributed to collection of samples from the First Affiliated Hospital of Guangxi Medical College. B.L. contributed to collection of samples from the Affiliated Hospital of the Qingdao University. X.G., M.-Y.C., R.P. and R.-H.X. contributed to collection of samples from Sun Yat-sen University Cancer Center. H.-O.A., W.Y. and Y.-X.Z. supervised the design and implementation of the population-based case–control study in Zhaoqing. W.Y., E.T.C., S.-M.C., S.-H.X. and Z.Liu participated in the case–control study. The manuscript was drafted by M.X., J.L., W.Z. and Y.-X.Z., and revised by V.P. and E.T.C. All authors critically reviewed the article and approved the final manuscript.

Corresponding authors

Correspondence to Weiwei Zhai, Yi-Xin Zeng or Jianjun Liu.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information

Supplementary Figures 1–15, Supplementary Tables 2–4, 6–17, 19 and 20, and Supplementary Note

Reporting Summary

Supplementary Table 1

Information on the 270 EBV isolates sequenced in the current study and the 97 publicly available genomes included in our study

Supplementary Table 5

Variant information for the EBV genome isolates sequenced in the current study

Supplementary Table 18

The percentage of heterozygous variants in the 270 EBV genome isolates

About this article

Cite this article

Xu, M., Yao, Y., Chen, H. et al. Genome sequencing analysis identifies Epstein–Barr virus subtypes associated with high risk of nasopharyngeal carcinoma. Nat Genet 51, 1131–1136 (2019). https://doi.org/10.1038/s41588-019-0436-5

  • DOI: https://doi.org/10.1038/s41588-019-0436-5
