Cervical cancer is the most common cancer affecting sub-Saharan African women and is prevalent among HIV-positive (HIV+) individuals. No comprehensive profiling of cancer genomes, transcriptomes or epigenomes has been performed in this population thus far. We characterized 118 tumors from Ugandan patients, of whom 72 were HIV+, and performed extended mutation analysis on an additional 89 tumors. We detected human papillomavirus (HPV)-clade-specific differences in tumor DNA methylation, promoter- and enhancer-associated histone marks, gene expression and pathway dysregulation. Changes in histone modification at HPV integration events were correlated with upregulation of nearby genes and endogenous retroviruses.
All molecular and clinical data used in this publication can be found on the National Cancer Institute’s Genome Data Commons Publication Page at https://gdc.cancer.gov/about-data/publications/CGCI-HTMCP-CC-2020. Data from this publication are publicly available for download through dbGaP (phs000528), as part of the NCI Cancer Genome Characterization Initiative (CGCI; phs000235). Sample metadata are reported in Supplementary Table 2. TCGA cervical cancer data (file name: CESC.snp__genome_wide_snp_6__broad_mit_edu__Level_3__segmented_scna_minus_germline_cnv_hg19__seg.seg.txt) were obtained from http://gdac.broadinstitute.org/runs/stddata__2016_01_28/data/CESC/20160128/. Source data are provided with this paper.
Bioinformatics analyses in this study were conducted with open-source software, with the exception of tumor purity and ploidy estimation, which was performed with Ploidetect (https://github.com/lculibrk/ploidetect).
This project has been funded in whole or in part with US federal funds from the National Cancer Institute, National Institutes of Health, under contract no. HHSN261200800001E and HHSN261201500003I. The content of this publication does not necessarily reflect the views or policies of the Department of Health and Human Services, nor does mention of trade names, commercial products or organizations imply endorsement by the US government. We gratefully acknowledge the Fred Hutchinson Cancer Research Center and the Uganda Cancer Institute for overseeing sample and data collection in Uganda. We are grateful for contributions from the other members of the HTMCP Cervical Cancer Working Group at the Department of Epidemiology, University of Alabama at Birmingham, the Pancreas Centre BC and various groups at Canada’s Michael Smith Genome Sciences Centre, including those from the Biospecimen, Library Construction, Sequencing, Bioinformatics, Technology Development, Quality Assurance, LIMS, Purchasing and Project Management teams. We thank the AIDS and Cancer Specimen Resource for logistical coordination and support of this project through NIH grants U01CA066535, U01CA096230 and UM1CA181255. L.C. and V.L.P. are the recipients of CIHR Frederick Banting and Charles Best Canada Graduate Scholarships GSD-164207 and GSD-152374, respectively. S.J.M.J. is the recipient of the Canada Research Chair in Computational Genomics. This research was supported by the Intramural Research Program of the NIH, National Cancer Institute (R.Y.). C.C. is supported by NIH grant P30AI027757. G.B.M. is supported by NCI grants U01CA217842 and P50CA098258. M.A.M. is the recipient of the Canada Research Chair in Genome Science. This work was supported in part by funding provided by the Canadian Institutes for Health Research (CIHR award FDN-143288) to M.A.M. A.I.O. was supported in part by the Endlichhofer Trust (OCCC 3120957) and a V Foundation grant (DVP2018-007).
G.B.M. reports the following potentially competing interests: SAB/consultant: AstraZeneca, Chrysallis Biotechnology, ImmunoMET, Ionis, Lilly, PDX Pharmaceuticals, Signalchem Lifesciences, Symphogen, Tarveda, Zentalis; stock/options/financial: Catena Pharmaceuticals, ImmunoMet, SignalChem, Tarveda; licensed technology: HRD assay to Myriad Genetics, DSP patents with Nanostring; sponsored research: Nanostring Center of Excellence, Ionis (provision of tool compounds). R.Y. reports the following potentially competing interests: research support from a CRADA with Celgene/BMS. T.C.W. reports the following potentially competing interests: consultant to Roche, BD and Inovio with respect to HPV diagnostic tests and therapeutic vaccines.
a. Coding mutations per Mb in samples exhibiting low (≤ 0.4) and high (> 0.4) APOBEC signatures. b. Difference in PIK3CA expression by HIV status. c. Mutations in the top 20 most mutated epigenetic modifiers, ordered by frequency of alterations for the cohort (n = 118). APOBEC signature proportion and homologous recombination deficiency (HRD) scores are reported above. HIV status, age at diagnosis, tumor histology (“other” includes neuroendocrine and undifferentiated) and stage are also annotated. d. Comparison of mutation frequencies of the 12 significantly mutated genes (SMGs) in the discovery vs. extension cohorts. Boxplots in a and b represent the median, upper and lower quartiles of the distribution and whiskers represent the limits of the distribution (1.5-times interquartile range), and statistics were determined using two-sided Wilcoxon rank sum tests.
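The boxplot convention used in these legends (median, quartiles, whiskers at 1.5 times the interquartile range) can be illustrated with a minimal sketch. It assumes the common Tukey definition, in which each whisker extends to the most extreme data point within 1.5 × IQR of the nearer quartile, and it assumes an "inclusive" quartile interpolation. The legends do not state which plotting software or quantile method was used, so the function name and the example values below are hypothetical.

```python
import statistics

def boxplot_summary(values):
    """Median, quartiles, Tukey whiskers and outliers for one group.

    Assumes at least two values. Quartiles use the 'inclusive' method,
    which is an assumption, not something stated in the figure legends.
    """
    data = sorted(values)
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    # Whiskers stop at the most extreme observations inside the fences.
    inside = [v for v in data if lower_fence <= v <= upper_fence]
    outliers = [v for v in data if v < lower_fence or v > upper_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": inside[0],
        "whisker_high": inside[-1],
        "outliers": outliers,
    }

# Hypothetical values, for illustration only (not data from the study).
summary = boxplot_summary([1, 2, 3, 4, 5, 6, 7, 8, 9, 100])
```

Under this convention a point beyond the fences, such as the value 100 in the hypothetical input, is drawn as an individual outlier rather than being covered by a whisker.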
Extended Data Fig. 2 Association of HPV clades with HIV status, gene expression, DNA methylation and survival.
a. HPV types in our cohort separated by HIV status (n = 72 positive samples, n = 45 negative), and clade. The x axis indicates the percentage of samples in that cohort infected by the indicated HPV type, and in brackets is the number of samples. b. Unsupervised clustering of the top 1,000 most variable genes across our cohort (n = 118 samples). q-values were determined using Benjamini-Hochberg (BH) corrected Fisher exact tests. c. Percentage of differentially methylated probes between clades (A7 = 51 samples, A9 = 56 samples) at different genomic features, by HPV clade. d. Log2 fold change and adjusted (BH) p-value of differentially expressed genes between clade A7- (n = 52) and A9-infected (n = 57) samples. e. Volcano plots showing the log2 fold change and adjusted p-value (BH) of differentially expressed genes between clade A7- (n = 52) and A9-infected (n = 57) samples associated with A9 hypermethylated (top) and A7 hypermethylated (bottom) differentially methylated regions (DMRs). f, g. Top: kernel density of E6 (f) and E7 (g) expression in the HTMCP cohort separates samples into high- and low-expressing cases. Bottom: gene ontologies enriched in differentially expressed genes in samples with low / high E6 (n = 68 / n = 48) (f) and E7 (n = 58 / n = 59) (g). h. Multivariate Cox proportional hazards model for HPV clade, HIV status and disease stage for 66 patients. Hazard ratios and p-values reported for each variable were determined using log-rank tests. Where relevant, all statistical tests were two-sided.
a. Cluster of clusters analysis for 54 consensus clustering solutions for all histone marks on 52 samples (solutions with k = 2 to 10 for each mark). The heatmap color indicates the sample probabilities in the consensus matrix. q-values for each variable were determined using Benjamini-Hochberg corrected Fisher exact tests. b. Schematic showing the cluster of clusters solution (k = 5 for H3K27ac and H3K4me3 and k = 4 for the other marks) for all histone marks and for the 3 active marks. Each dot represents a sample and dot color represents the cluster membership of the sample. Hollow circles indicate no available ChIP data for that sample. c. Fold change of H3K4me3 abundance and gene expression between clades associated with TSS of genes (−5/+20 kb) found at intersecting H3K4me3 and H3K27ac peaks. Sample Ns used for differential analyses (and derivation of adjusted p-values) were: expression A7 = 52, A9 = 57; H3K4me3 and H3K27ac A7 = 25, A9 = 22. Genes with BH-adjusted p-values <0.05 (DESeq, Methods) are highlighted. d. Expression of the genes reported in Fig. 4f separated by HPV clade. Boxplots represent the median, upper and lower quartiles of the distribution and whiskers represent the limits of the distribution (1.5-times interquartile range), and p-values were calculated by Wilcoxon rank sum tests. Where relevant, all statistical tests were two-sided.
a. Number of HPV integration sites per event separated by HPV clade. b, c, e, f. Distribution of the number (b, e) and fold change in integrated samples (c, f) of genes (b, c) and ERVs (e, f) near integration events. d. Expression (RPKM) of selected genes near HPV integration events in each sample (n = 118). g. Fold change of ERVs near integration events separated based on the clusters identified in Fig. 5f. h. Histone mark coverage of a 115 kb genomic region containing ERVs. The line represents an integration event, and arrows indicate individual integration sites. Top tracks refer to a case with integration, and the bottom to a control case without integration. i. Total T-cell scores and estimated tumor content of samples with HPV integration events that are associated with significant changes in expression of ERVs or genes, and those that are not. j, k. CIBERSORT scores for all CD4+ T-cells (sum) and CD8+ T-cells (j), follicular helper T-cells and neutrophils (k) separated by HIV status (HIV+ n = 72, HIV− n = 45). Boxplots in a, d, g, i–k represent the median, upper and lower quartiles of the distribution and whiskers represent the limits of the distribution (1.5-times interquartile range). All p-values were determined by Wilcoxon tests unless otherwise stated, and q-values were corrected using the Benjamini-Hochberg method. Where relevant, all statistical tests were two-sided.
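The Benjamini-Hochberg correction referred to in these legends can be sketched as follows. This is a generic, minimal implementation of the standard step-up procedure, offered for illustration only. The legends do not specify which software computed the q-values, and the function name and input p-values here are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return BH-adjusted q-values in the same order as the input p-values."""
    m = len(p_values)
    if m == 0:
        return []
    # Indices of the p-values, sorted from smallest to largest p.
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down to the smallest so that each
    # q-value is the minimum of p * m / rank over all ranks at or above it.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        candidate = p_values[idx] * m / rank
        running_min = min(running_min, candidate)
        q_values[idx] = running_min
    return q_values

# Hypothetical p-values, for illustration only (not results from the study).
q = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
```

The running minimum enforces monotonicity, so a test with a smaller p-value never receives a larger q-value than a test with a larger p-value.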
Source data for data presented in Fig. 1.
Source data for data presented in Fig. 2.
Source data for data presented in Fig. 3.
Source data for data presented in Fig. 4.
Source data for data presented in Fig. 5.
Source data for data presented in Extended Data Fig. 1.
Source data for data presented in Extended Data Fig. 2.
Source data for data presented in Extended Data Fig. 3.
Source data for data presented in Extended Data Fig. 4.
Cite this article
Gagliardi, A., Porter, V.L., Zong, Z. et al. Analysis of Ugandan cervical carcinomas identifies human papillomavirus clade–specific epigenome and transcriptome landscapes. Nat Genet 52, 800–810 (2020). https://doi.org/10.1038/s41588-020-0673-7