Identification of novel methylation markers in HPV-associated oropharyngeal cancer: genome-wide discovery, tissue verification and validation testing in ctDNA

Human papilloma virus (HPV)-associated oropharyngeal cancer (OPC) is an independent tumour type with regard to cellular, biological, and clinical features. The use of non-invasive biomarkers such as circulating tumour DNA (ctDNA) may be relevant in early diagnosis and eventually improve the outcomes of patients with head and neck squamous cell carcinoma (HNSCC). Genome-wide discovery using RNA sequencing and reduced representation bisulfite sequencing yielded 21 candidates for methylation-targeted genes. A verification study (252 HNSCC patients) using quantitative methylation-specific PCR (Q-MSP) identified 10 genes (ATP2A1, CALML5, DNAJC5G, GNMT, GPT, LY6D, LYNX1, MAL, MGC16275, and MRGPRF) that showed a significant increase recurrence in methylation groups with OPC. Further study on ctDNA using Q-MSP in HPV-associated OPC showed that three genes (CALML5, DNAJC5G, and LY6D) had a high predictive ability as emerging biomarkers for a validation set, each capable of discriminating between the plasma of the patients from healthy individuals. Among the 42 ctDNA samples, methylated CALML5, DNAJC5G, and LY6D were observed in 31 (73.8%), 19 (45.2%), and 19 (45.2%) samples, respectively. Among pre-treatment ctDNA samples, methylated CALML5, DNAJC5G, and LY6D were observed in 8/8 (100%), 7/8 (87.5%), and 7/8 (87.5%) samples, respectively. Methylated CALML5, DNAJC5G, and LY6D were found in 2/8 (25.0%), 0/8 (0%), and 1/8 (12.5%) of the final samples in the series, respectively. Here, we present the relationship between the methylation status of three specific genes and cancer recurrence for risk classification of HPV-associated OPC cases. In conclusion, ctDNA analysis has the potential to aid in determining patient prognosis and real-time surveillance for disease recurrences and serves as an alternative method of screening for HPV-associated OPC.


Introduction
Head and neck squamous cell carcinoma (HNSCC) prognoses remain poor due to high recurrence rates [1]. The prognosis of HNSCC is largely based on pathological staging systems. Current post-treatment surveillance of HNSCC patients is monitored via clinical evaluation combined with flexible endoscopy and conventional imaging [2]. Genomic sequencing has delivered vast amounts of information regarding different cancer subtypes and is providing new therapeutic targets [3]. The hope is that effective and precise theragnostic-based options will provide medical benefit to HNSCC patients.
The incidence of human papilloma virus (HPV)-associated oropharyngeal cancer (OPC) has increased in recent decades and is now the most prevalent HPV-related cancer in developed countries [4]. An estimated 70-80% of OPCs are now attributed to HPV in the United States, western Europe and Japan [5]. Compared with smoking-related or alcohol-related OPC, HPV positivity is associated with a better prognosis [6]. Despite the prognostic value added by HPV status, subsets of patients continue to demonstrate outcomes discordant with their disease stage. Several studies have demonstrated that risk stratification of patients with HPV-associated OPC is caused by tobacco smoking history: low risk (HPV-associated, 10 pack years) and intermediate risk (HPV-associated, >10 pack years) [7][8][9]. Previous studies have suggested that the combination of HPV DNA and p16 testing was a significantly better prognostic marker than either HPV DNA or p16 alone [10].
Recently, the potential usefulness of circulating tumour DNA (ctDNA) as a biomarker was shown in several applications in cancer management. A liquid biopsy utilising tumour DNA found in blood could address many limitations of tissue biopsies, while also opening the innovative paradigms in cancer care. Analyses using DNA methylation-based strategies for detection of ctDNA have suggested that such approaches may provide new advance for early cancer diagnosis [11]. As ctDNA is composed largely of short fragments, short amplicons are required for maximum sensitivity of PCR reactions, particularly if mutations are being detected [12]. Recent studies highlight this potential for ctDNA methylation, demonstrating that even with cancer-specific epigenetic changes, sufficient cell-type specificity remains to be determined to permit identification of cell-of-origin [13].
To our knowledge, our study is the first to suggest that Calmodulin-like 5 (CALML5), DnaJ heat shock protein family member C5 gamma (DNAJC5G), and Lymphocyte antigen 6 complex locus D (LY6D) methylation is associated with worse disease-free survival (DFS) and may be a critical event in HPV-associated OPC. To improve the survival rate for head and neck cancer, real-time monitoring of patients following surgery as a measure of therapeutic efficacy can be carried out. Here, we report the results of a combined discovery, verification, and validation study aimed at identifying novel blood-based DNA methylation markers, and document their ability to efficiently improve the diagnosis and prognosis of HPV-associated OPC.

Results
Experimental setup of the study and genome-wide epigenetic profiling using RNA sequencing (RNAseq) and reduced representation bisulfite sequencing (RRBS) An outline of the study is given in Fig. 1. A total of 1,968 markers were upregulated at least two-fold after 5-aza-2′-deoxycytidine (5-Aza-CdR) treatment compared to untreated controls identified using RNA-seq. In the RRBS data, we selected 284 hypermethylated promoter regions with more than a 20-fold difference between the tumour and normal tonsillar samples. By combining the RNA-seq data with the RRBS data, we identified 21 possibly methylated markers for HPV-associated OPC. The findings from the discovery profiling , AGAP2, ALPL, ANGPTL2, ATP2A1,  CALML5, DNAJC5G, FDFT1, GNMT, GPT, HOXB3,  KLK11, LMF1, LY6D, LYNX1, MAL, MGC16275,  MRGPRF, NKPD1, SH2D3C, TNNI2, and ZNF876P, were further analysed using quantitative methylation-specific PCR (Q-MSP) in an independent large series of HNSCC specimens.

Verification analysis of methylation status in HNSCC tissue samples
A Q-MSP analysis of the methylation status of 21 epigenetic marker genes was performed using 252 primary HNSCC samples, including 35 HPV-associated OPC, 35 HPV-negative OPC, 60 hypopharyngeal cancers, 49 laryngeal cancers, and 73 oral cavity cancer samples (Fig. 2a). The methylation frequencies for these genes are summarised in Table 1 and Fig. 2b. Site-specific methylation frequencies across the 21 genes for the oropharynx, hypopharynx, larynx, and oral cavity are shown in Fig. 2c. There were no significant associations between the methylation index and the primary tumour site.

Association between methylation index and clinicopathological characteristics
For the full panel of HPV-negative oropharyngeal, hypopharyngeal, laryngeal, and oral cancer patients, there were no significant associations with any clinicopathological characteristics ( Fig. 3a-f). Among the HPV-associated OPC, the methylation index was significantly lower in females than in male patients, as well as in non-drinkers relative to drinkers (Fig. 3b).

Stratification analysis
The relation between the methylation status and risk of recurrence was analysed through multivariate analysis using a Cox proportional hazards regression model adjusted for age, gender, smoking status, alcohol consumption, and clinical tumour stage. The methylation of the 21 marker genes was not associated with recurrence in patients with hypopharyngeal, laryngeal, or oral cancers (Fig. 4a). In patients with oropharyngeal cancers (HPV-associated and HPV-negative), we found that hypermethylation of ATP2A1, DNAJC5G, GNMT, GPT, LYNX1, MAL, and MRGPRF were associated with significantly reduced DFS, with hazard ratios of 4. 17 (Fig. 4c). In patients who had HPV-negative OPC, the methylation of the 21 marker genes was associated with an elevation in the odds of recurrence that was not significant (Fig. 4d).

Kaplan-Meier estimates in HPV positive oropharyngeal cancer patients
Based on the HPV-associated OPC patients, Kaplan-Meier survival curves for each of the 21 genes are shown in Fig. 5. The DFS time did not significantly differ between patients with methylated genes and those with unmethylated genes, with four notable exceptions: DFS was significantly shorter when GPT, LY6D, MAL, and MRGPRF were methylated (P = 0.021, P = 0.019, P = 0.012, and P = 0.007, respectively; Fig. 5).

Serial ctDNA methylation analysis
The interrelationship between clinical evaluations and changes in serial ctDNA methylation levels of CALML5, DNAJC5G, and LY6D in samples from Patients 1-8 appears in Fig. 7a

Discussion
The biomarkers we describe have the potential to affect one of the most important unmet clinical needs in HPVassociated OPC: a risk classification system that is more effective than patient history and physical examination [14]. The advantages of our study are the evaluation of a large, independent series of different patient-derived sample types, and the use of a discovery-verification-validation approach, which allowed us to evaluate the succession of methylation events and the selection of methylation targets which are prognostically relevant and associated with HPV-associated carcinogenesis.
Plasma circulating tumour HPV DNA (ctHPVDNA) is a promising biomarker for monitoring treatment responses in patients with HPV-associated OPC [15]. According to the results, the patients with ctHPVDNA OPC possessed significantly higher HPV DNA viral loads in their tumour tissues than the ctHPVDNA-negative patients [16]. Longitudinal monitoring of ctHPVDNA during post-treatment surveillance could accurately detect clinical disease recurrence [17]. The detection of small tumours indicates that ctHPVDNA could be readily used in early detection screening [18].
Tumour DNA, as determined by the presence of somatic mutations (TP53: 72%, p16 (CDKN2A): 22%, p110a (PIK3CA): 21%, and Notch Receptor 1 (NOTCH1): 19%) is highly specific for HNSCC in The Cancer Genome Atlas (TCGA) data [19]. The use of mutations commonly found in cancers is clinically attractive, and quantification of TP53 mutations in ctDNA from HNSCC patients using digital PCR is technically feasible [20]. PIK3CA mutation was detected in the plasma samples of 31% HNSCC patients [21]. The detection of early diagnosis markers becomes slightly more challenging when the number of alterations need to be detected across large multi-exon genes, such as TP53 or NOTCH1 [22].
HNSCC is characterised by some early epigenetic alterations, several of these being biomarkers found in circulation [23]. DNA methylation of certain genes in other cancers sites can be found among the ctDNA and has been studied by many groups, but it should be noted that substantial numbers of healthy controls also have methylation of ctDNA for these genes [24,25]. At present, there are only a few studies which investigated a relation between ctDNA methylation events and prognosis in HNSCC [26,27]. Quantitative DNA methylation levels for septin 9 (SEPT9) and short-stature homeobox 2 (SHOX2) in the ctDNA proved to be powerful prognostic and molecular staging biomarkers for identifying patients at a higher risk of tumour recurrence [26]. Hypermethylation of endothelin receptor type B (EDNRB) was identified in the serum of 10% of patients with HNSCC with a high degree of specificity [27].
By unbiased comprehensive stepwise verification and validation studies, we identified three new promising methylation markers (CALML5, DNAJC5G and LY6D) associated with HPV-associated OPC. To the best of our knowledge, methylation of CALML5 and DNAJC5G have not been described before in HNSCC, whereas LY6D was identified previously [28].
CALML5, a skin-specific calcium binding protein, is closely associated with keratinocyte differentiation [29]. CALML5 protein is a regulator of terminal differentiation of epidermal cells, and high-density CALML5 conditions promote YAP1 translocation to the cytoplasm [30]. CALML5 ubiquitination in the nucleus is involved in the carcinogenesis of breast cancer in young women [31].
The human genome encodes more than 40 identified DNAJ protein family members, which can be classified into Fig. 3 Association between methylation index and selected clinical parameters. Mean methylation index for the various groups compared using Student's t test. Associations between methylation index and selected epidemiologic and clinical characteristics: a Among the 252 cases; no differences were noted with regard to any of the clinical characteristics; b HPV-associated OPC; statistically significant relationships were found between the methylation index and gender or alcohol consumption; c HPV-negative oropharyngeal cancer; d hypopharyngeal cancer; e laryngeal cancer; f oral cancer. Means and standard deviations are indicated, and statistical comparisons between groups are depicted. *P < 0.05 was considered to represent a statistically significant difference.  [32]. DNAJC5G significantly inhibits the replication efficiencies of adenovirus, vaccinia virus, and HIV-1 [33]. Another member of the HSP40 proteins, DNAJA3, was initially characterised through its interaction with the HPV16 E7 oncoprotein [34].
LY6D is a member of the LY6 family, and it is a membrane-bound protein with a glycosylphosphatidylinositol anchor [35]. LY6D plays an important role in the adhesion of head and neck cancer cells to endothelial cells, and is detected of micrometastases in lymph nodes of patients with HNSCC [28]. LY6D may serve as a prognostic maker for advanced prostate cancer and oestrogen receptor-positive breast carcinomas [36,37]. Recently, aberrant DNA hypermethylation in the LY6D gene promoter region has been reported in several tumours such as a gefitinib-resistant lung cancer cell line and helicobacterassociated gastric cancer in mice [38,39].
Liquid biopsy needs further methodological evaluation as well as cost-based assessment [40]. A smooth translation into clinical practice requires a systematic assessment of the health economic benefits [41]. Patients who have undetectable ctHPVDNA during clinical follow-up are unlikely to have recurrent disease and may be spared routine radiographic and in-office pharyngoscopic surveillance [17]. Furthermore, ctHPVDNA surveillance may be more cost effective than current surveillance approaches (e.g., PET-CT) [15].
Currently, the lack of an effective OPC screening program is because there are no identified OPC precursor lesions. The HPV-associated carcinogenesis initially arises in tonsillar crypts, which may be the most likely reason why the HPV prevalence in tonsillar cells is higher [4]. In the future, we think that highly sensitive and tissue-specific ctDNA biomarkers of HPV-associated OPC will be discovered, allowing early detection. This study, involving human specimens and high-throughput profiling platforms, may be susceptible to measurement bias from various sources; accordingly, the use of methylation markers in clinical practice requires further testing in prospective studies with larger HNSCC cohorts. Eventually, these efforts will lead to the identification of new oncological biomarkers for early detection and outcome prediction, which is a prerequisite for realising the advantages of precision medicine. Our study demonstrated the utility of using parallel

Cell lines and treatments
The UM-SCC-47 cell line (HPV16 positive), derived from a primary tumour of the lateral tongue of a male patient, was obtained from the University of Michigan (Ann Arbor, MI, USA). The cells were cultured in Dulbecco's modified Eagle's medium (FUJIFILM Wako Pure Chemical Corporation, Osaka, Japan) supplemented with 10% foetal bovine serum (Gibco, Thermo Fisher Scientific Inc., Waltham, MA) and 1% penicillin/streptomycin (Wako) in a humidified atmosphere containing 5% CO 2 at 37°C. Twelve hours after plating, cultures were incubated either for 48 h with 5-Aza-CdR (5 μM; Sigma, St. Louis, MO). The medium was then removed, and cultures were maintained in standard Dulbecco's modified Eagle's medium, which was replaced every other day.

Clinical tumour samples
Surgical HNSCC tumour and matched adjacent non-tumour tissues were obtained from 252 patients who underwent surgical resection at the Department of Otolaryngology/ Head and Neck Surgery, Hamamatsu University School of Medicine (Hamamatsu, Shizuoka, Japan). Written informed consent was obtained from individual patients before surgery and the experimental protocol was approved by the Hamamatsu University School of Medicine (date of board approval: ethics code: 25-149 and 17-041). The ratio of males to females was 217:35. The mean age was 65.0 years (range, 32-92 years). Primary tumours were composed of 35 HPV-associated OPC, 35 HPV-negative OPC, 60 hypopharyngeal carcinomas, 49 laryngeal carcinomas, and 73 oral cavity carcinomas (Supplementary Table S1).

Liquid biopsy samples
ctDNA was isolated from 4.0 mL of blood plasma by affinity-based binding to magnetic beads according to the manufacturer's instructions (QIAamp MinElute ccfDNA Kit, QIAGEN). Peripheral blood (10 ml each) was collected in cell-stabilising tubes (Cell-Free DNA Collection Tube, Roche, CA, USA). ctDNA was extracted on the day of surgery or outpatient visits. Eight patients with HPVassociated OPC and eight healthy volunteers were also enroled as validation analysis. Patient characteristics are shown in Supplementary Table S2.

RNA-seq analysis
RNA-seq analysis was performed against cancer cell lines treated with/without 5-Aza-CdR. Total RNA was isolated using an RNeasy Plus Mini Kit (QIAGEN); cDNA was synthesized using a ReverTra Ace qPCR RT Kit (Toyobo, Tokyo, Japan). Libraries for RNA-seq were prepared using the Sur-eSelect Strand-Specific RNA Library Preparation Kit (Agilent Technologies, Santa Clara, CA) and subjected to massively parallel sequencing with a GAIIx (Illumina, San Diego, CA) using a single-end 36-bp or 50-bp sequencing length protocol. Sequenced tags were aligned to the human reference genome (build hg19) using CASAVA 1.8.2 (Illumina), and gene expression was normalized to the amount of reads per kilobase of exon per million mapped (rpkm) (Fig. 1a).

DNA extraction and bisulfite modification
DNA extraction from fresh tissue was performed using a QIAamp DNA Mini Kit (QIAGEN, Hilden, Germany). Sodium bisulphite conversion was performed using the MethylEasy Xceed rapid DNA bisulfite modification kit (TaKaRa, Tokyo, Japan) following the manufacturer's protocol.

RRBS
Total 500 ng genomic DNA extracted from the cells were first digested with MspI restriction enzyme. DNA fragments were then subjected to end-repair and dA-tailing using an Fig. 7 Methylation status for each patient in serial ctDNA samples of HPV-associated OPC patients receiving treatment at different time points during follow-up. a Patient 1, initial treatment was a right oropharyngectomy and bilateral ND. Concomitant CRT was administered postoperatively. During follow-up, the patient was also found to have fluorodeoxyglucose-avid lung nodules as well, one of which was subsequently extracted, confirming metastatic HPV16 positive SCC. After one dose of 5-FU+ CDDP+ cetuximab and seven doses of nivolumab, the clinical response was a stable disease (SD) during chemotherapy. After initial treatment, three ctDNA methylation markers were non-decreasing. However, three ctDNA methylation levels were showing a significant decrease due to the response to the chemotherapy. b Patient 2 experienced a reduction in ctDNA methylation levels after concurrent CRT. c Patient 3 represented a drop in ctDNA methylation levels after oropharyngectomy and bilateral ND. d Patient 4, CALML5 and DANJC5G methylation levels gradually decreased after oropharyngectomy and a left modified ND. In this case, the elevation of CALML5 and LY6D methylation levels correlated with regional lymph node metastasis. e Patient 5, ctDNA methylation levels gradually decreased after oropharyngectomy and a left modified ND. f Patient 6 experienced a reduction in ctDNA methylation levels after oropharyngectomy and a left ND. In this case, CALML5 methylation level enhances metastasis of regional lymph node. g Patient 7 was treated by an excision of the tonsillar tumour with bilateral modified ND. After 12 months, the three ctDNA methylation markers were undetectednot detected. h Patient 8, received concurrent CRT and, experienced a reduction in ctDNA methylation levels.
NEBNext DNA Library Prep Master Mix Set (NEB, Ipswich, MA), followed by the ligation with methylated adaptor oligo-DNA (NEB). After the size selection and the bisulfite conversion from unmethylated cytosine to uracil by using an EZ DNA Methylation kit (Zymo Research, Irvine, CA), DNA fragments were amplified by 18 cycles of PCR using primers included in the NEBNext kit. The generated library was sequenced by the 36 bp single-end protocol of GAIIx (Illumina). As unmethylated cytosine of the library is converted to thymine during the PCR amplification, mapping of sequenced tags to human reference genome (hg19, UCSC Genome Browser) was performed by Bismark software [42]. In this study, methylation level was estimated in CpGs with ≥10 reads (713,353 CpGs per sample in an average). Statistical analysis was done by methyl kit [43]. The promoters were defined as regions spanning −2.5 to +0.5 Kb from transcription start sites and containing ≥5 CpGs, and those being read ≥10 times were subjected to differential methylation analysis (Fig. 1a).

Q-MSP
Aberrant DNA methylation, which often occurs around the transcriptional start site (TSS) within a CpG island, was evaluated using Q-MSP. The sequences of primers used in this study are shown in Supplementary Table S3. A standard curve for Q-MSP was constructed by plotting five serially diluted standard solutions of EpiScope methylated HeLa gDNA (TaKaRa). To analyse the normalised methylation value (NMV) of the PCR conditions, an analytical method was used as previously described [44].
Detection of high-risk HPV DNA using PCR and p16 immunostaining HPV status was determined using the HPV Typing Set (Takara), a primer set for PCR specifically designed to identify HPV genotypes 16,18,31,33,35,52, and 58 using primary tumor DNA. The PCR HPV Typing Set method was performed according to the manufacturer's instructions. To identify the HPV status of ctDNA, samples were also subjected to PCR using specific primers for HPV type 16. The primers for the HPV16 E7 sequence detection were 5′-TCCAGCTGGACAAGCAGAAC-3′ (forward primer), 5′-CACAACCGAAGCGTAGAGTC-3′ (reverse primer) [45]. For the immunohistochemistry analysis, an anti-human p16 monoclonal antibody (clone: E6H4, Roche Diagnostics GmbH, Germany) was used ( Supplementary Fig. S1).

Data analysis and statistics
A receiver operator characteristic (ROC) curve analysis of the target genes was performed with the NMVs for 79 matched paired HNSCC and normal mucosal samples using the Stata/SE 13.0 system (Stata Corporation, College Station, TX, USA). To determine the area under the ROC curve, the true positive rate (sensitivity) was plotted as a function of the false positive rate (1-specificity) for different cut-off points, and the NMV thresholds were calculated for each target gene. Cut-off values showing the greatest accuracy were determined based on sensitivity/specificity, as indicated in Supplementary Table S4. The methylation index was defined as the number of genes with promoter methylation [46]. Student's t tests were performed to evaluate the associations between the clinical variables and the methylation index. DFS was investigated using the Kaplan-Meier method and the log-rank test. The probability of survival can be evaluated by generating a Kaplan-Meier curve. A Cox's proportional hazards regression analysis that included age (≥65 vs. <65 years), gender, alcohol intake, smoking status, and tumour stage (I-II vs. III-IV) and methylation status was used to identify the multivariate predictive value of prognostic factors. A value of P < 0.05 was considered statistically significant.
Author contributions KM and AI conceived the study. KM and YM designed the experiments. HM, AK, DM, MM, SY, TK, TN and HM analysed the data and prepared figures and tables. All authors wrote the manuscript, reviewed its drafts, approved its final version and agreed with its submission.

Compliance with ethical standards
Conflict of interest The authors declare that they have no conflict of interest.
Ethics approval and consent to participate The research methodology employed in this study was approved by The Institutional Review Board of the Hamamatsu University School of Medicine. All study subjects provided written informed consent.
Consent for publication Consent for publication was obtained from all patients.
Publisher's note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons. org/licenses/by/4.0/.