Performance of DNA methylation assays for detection of high-grade cervical intraepithelial neoplasia (CIN2+): a systematic review and meta-analysis

Background To conduct a meta-analysis of performance of DNA methylation in women with high-grade cervical intraepithelial neoplasia (CIN2+). Methods Medline and Embase databases were searched for studies of methylation markers versus histological endpoints. Pooled sensitivity, specificity and positive predictive value (PPV) for CIN2+ were derived from bivariate models. Relative sensitivity and specificity for CIN2+ compared to cytology and HPV16/18 genotyping were pooled using random-effects models. Results Sixteen thousand three hundred thirty-six women in 43 studies provided data on human genes (CADM1, MAL, MIR-124-2, FAM19A4, POU4F3, EPB41L3, PAX1, SOX1) and HPV16 (L1/L2). Most (81%) studies evaluated methylation assays following a high-risk (HR)-HPV-positive or abnormal cytology result. Pooled CIN2+ and CIN3+ prevalence was 36.7% and 21.5%. For a set specificity of 70%, methylation sensitivity for CIN2+ and CIN3+ were 68.6% (95% CI: 62.9–73.8) and 71.1% (95% CI: 65.7–76.0) and PPV were 53.4% (95% CI: 44.4–62.1) and 35.0% (95% CI: 28.9–41.6). Among HR-HPV+ women, the relative sensitivity of methylation for CIN2+ was 0.81 (95% CI: 0.63–1.04) and 1.22 (95% CI: 1.05–1.42) compared to cytology of atypical squamous cells of undetermined significance, or greater (ASCUS+) and HPV16/18 genotyping, respectively, while relative specificity was 1.25 (95% CI: 0.99–1.59) and 1.03 (95% CI: 0.94–1.13), respectively. Conclusion DNA methylation is significantly higher in CIN2+ and CIN3+ compared to ≤CIN1. As triage test, DNA methylation has higher specificity than cytology ASCUS+ and higher sensitivity than HPV16/18 genotyping.


BACKGROUND
Invasive cervical cancer (ICC) is one of the most common female cancers in low and middle-income countries (LMIC), where 85% of the estimated 570,000 global annual cases occur and where it is the leading cause of cancer deaths among women. 1 ICC is one of the most preventable and treatable forms of cancer, as long as it is detected early and managed effectively. In May 2018, the Director-General of the World Health Organization (WHO) made a global call for action towards the elimination of ICC, calling for more innovative technologies for detection of precancerous lesions and better strategies to increase ICC screening coverage and uptake. 2 There is strong evidence that high-risk human papillomavirus (HR-HPV) DNA-based screening is more sensitive for the detection of high-grade CIN (CIN2+) and is effective in prevention of ICC compared to cervical cytology and visual inspection. 3 However, HPV testing detects many transient infections, meaning that its specificity for high-grade CIN is low, 4 which has important implications for screening women with high prevalence of HR-HPV. Novel methods are required that are sensitive enough to detect clinically relevant HPV infections needing colposcopy referral but specific enough to rule out HPV-positive women without evidence of disease, thereby avoiding repeat testing, which can result in substantial loss to follow-up, 5 as well as unnecessary referrals for colposcopy, which increase the workload and costs to the services. DNA methylation of human genes and the HPV virus occurs during HR-HPV infection and precancerous tissue progression, leading to alterations in the functions of gene products regulating tumour suppression. 6,7 Such aberrant DNA methylation may help distinguish non-progressive HPV infections from those that will progress to cancer. Increased DNA methylation has been shown to be associated with increasing persistence of HR-HPV genotypes, 8 severity of CIN lesions 9 and risk of invasive cancer.
10 Recent studies evaluating DNA methylation of human genes and the HPV virus for detection of HPV related lesions included different human genes and HPV genotypes. Furthermore, these studies varied in the CpG (cytosine followed by guanine) dinucleotide sites chosen, many of which occur in the human genome, in contrast to the HPV genome, which does not have any clearly discernible CpG islands. 6 Previous systematic reviews have summarised the association and performance of DNA methylation for CIN2+ and CIN3+ detection, 6,8,9,11 although none have yet quantified the performance in a meta-analysis.
The aim of this review and meta-analysis is to evaluate the performance of various DNA methylation markers (human genes and HPV virus) for detection of CIN2+ and CIN3+. The novelty of our review is that it evaluates: (i) the association of host and HPV methylation positivity with increasing grades of CIN (Analysis 1); (ii) the pooled sensitivity, specificity and positive predictive value (PPV) of DNA methylation markers for the detection of CIN2+ and CIN3+ in a triage setting (i.e. following HR-HPV-positive test or abnormal cytology; Analysis 2) and (iii) the relative sensitivity and relative specificity of DNA methylation markers compared to HPV16/18 genotyping and cytology (using a cut-off of atypical squamous cells of undetermined significance, or greater [ASCUS+] or low-grade squamous intraepithelial lesions, or greater [LSIL+]), for the detection of CIN2+ and CIN3+ in a triage setting (Analysis 3).

Study outcomes
Studies were included if they reported the percentage of DNA methylation according to CIN grade, or sensitivity and specificity of the DNA methylation assays for the detection of the outcome, or if the receiver operating characteristic (ROC) curve was provided from which sensitivity and specificity estimates could be obtained.
Studies were included if methylation markers were assessed against a histological endpoint of CIN grade 2 or higher (CIN2/3, CIN2+ or CIN3+, which can include carcinoma in situ and ICC). Studies with cytological endpoint assessment only were excluded because of the lower sensitivity of cytology for detection of high-grade disease. 12

Inclusion and exclusion criteria
Studies reporting methylation within biopsy specimens were excluded as we aimed to evaluate the performance of DNA methylation assays as potential primary screening or triage tests, for which cervical swabs would be used. Studies that reported only crude percentage methylation estimates without a validated cutoff for CIN2+ detection were excluded as they were not verified or validated. Studies were excluded if cancers represented greater than 10% of all samples and it was not possible to analyse the cancers, CIN2 and CIN3 separately, because of the risk of spectrum bias arising from the very high levels of methylation in the majority of cancers.

Search
Studies not in the English language or conference abstracts were excluded due to difficulty in assessing the quality of the methodology, as were studies with fewer than 25 participants, which could result in an unacceptably imprecise effect measurement. Where publications provided DNA methylation measures using a combination set of gene markers, the DNA methylation of the individual markers and of the combination panel were presented separately in the results. A combination test was considered positive when any of the included gene markers was positive.
Our review was restricted to DNA methylation markers where there were 4 or more studies evaluating the performance of an individual marker (to reduce the potential heterogeneity when pooling a small number of studies), or if the marker had been evaluated as part of a large population-based screening study. Studies reporting only DNA methylation of HPV16 were included given the small number of studies evaluating DNA methylation of other HPV types.
Statistical analysis
Analysis 1. The percentage methylation (methylation positivity) was extracted for each grade of CIN (≤CIN1, CIN2, CIN3 and ICC) according to pre-defined thresholds; where pre-defined thresholds were not available, methylation positivity was calculated from ROC curves based on a set specificity of 70% for CIN2+/CIN3+ detection by one author (HK) and validated by a second author (AL). Crude (unadjusted) odds ratios (OR) and 95% confidence intervals (CI) were calculated for methylation positivity associated with each discrete grade of high-grade CIN (CIN2, CIN3 and ICC) compared to CIN1 or normal (≤CIN1). Random-effects meta-analyses were used to estimate pooled effects to account for between-study heterogeneity, and heterogeneity was examined using the I² statistic. 13 Sub-group analyses by DNA methylation marker were performed to compare pooled effects and heterogeneity.
Analysis 2. The numbers of true positives, false positives, true negatives and false negatives were extracted where available, obtained using study-specific thresholds to define methylation positivity. Where several thresholds for methylation positivity were reported or where only ROC curves were presented, sensitivity data were extracted based on a threshold that produced a predefined set specificity of 70% and separately a set specificity of 50%.
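The step of reading sensitivity off a ROC curve at a set specificity, as described for Analysis 2, can be sketched as follows. This is a minimal illustration only: it assumes linear interpolation between neighbouring ROC points, which the review does not specify, and the function name and example points are hypothetical.

```python
def sensitivity_at_specificity(roc_points, target_spec=0.70):
    """Interpolate sensitivity at a target specificity.

    roc_points: iterable of (specificity, sensitivity) pairs read off a ROC curve.
    Linear interpolation is an assumption made for this sketch.
    """
    pts = sorted(roc_points)  # ascending specificity
    if not pts:
        raise ValueError("no ROC points supplied")
    if target_spec < pts[0][0] or target_spec > pts[-1][0]:
        raise ValueError("target specificity outside the range of the ROC points")
    for (sp0, se0), (sp1, se1) in zip(pts, pts[1:]):
        if sp0 <= target_spec <= sp1:
            if sp1 == sp0:
                # Vertical segment: report the better of the two sensitivities
                return max(se0, se1)
            frac = (target_spec - sp0) / (sp1 - sp0)
            return se0 + frac * (se1 - se0)
    # Only reached when a single point was supplied and it matches the target
    return pts[0][1]


# Hypothetical ROC points (specificity, sensitivity), not taken from any study
example_roc = [(0.50, 0.85), (0.60, 0.80), (0.80, 0.60), (0.90, 0.40)]
print(sensitivity_at_specificity(example_roc, 0.70))
print(sensitivity_at_specificity(example_roc, 0.50))
```

Fixing specificity in this way puts studies with different positivity thresholds on a common footing before their sensitivities are pooled.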
The bivariate model 14 was used to estimate pooled sensitivity and specificity using metandi and midas in Stata; in this model, pairs of sensitivity and specificity are jointly analysed, incorporating any correlation that might exist between these two measures using a random-effects approach. Individual meta-analyses were performed for each of the human gene and HPV methylation markers. Because methylation markers are not independent of each other and given that most methylation markers perform better combined in a panel, a meta-analysis of combination markers was also performed, where available. To account for correlation between sensitivity and specificity, we used the hierarchical summary receiver operating characteristic (HSROC) model, 15 which allows for threshold effects and between- and within-study variability by allowing both test accuracy and threshold to vary across studies. Heterogeneity in the forest plots was assessed by visually examining the confidence intervals of individual studies.
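For orientation, one commonly used form of the bivariate random-effects model is written out below. This is a sketch of the standard exact-binomial formulation and is not reproduced from the review; the symbols (TP_i and TN_i for true positives and true negatives in study i, n_{1i} and n_{0i} for the numbers of diseased and non-diseased women, and the mean and covariance parameters) are introduced here for illustration only.

```latex
\begin{aligned}
TP_i \mid Se_i &\sim \mathrm{Binomial}(n_{1i},\, Se_i), \\
TN_i \mid Sp_i &\sim \mathrm{Binomial}(n_{0i},\, Sp_i), \\
\begin{pmatrix} \operatorname{logit}(Se_i) \\ \operatorname{logit}(Sp_i) \end{pmatrix}
&\sim \mathcal{N}\!\left(
\begin{pmatrix} \mu_{Se} \\ \mu_{Sp} \end{pmatrix},
\begin{pmatrix} \sigma_{Se}^{2} & \sigma_{Se,Sp} \\ \sigma_{Se,Sp} & \sigma_{Sp}^{2} \end{pmatrix}
\right)
\end{aligned}
```

Under this formulation the pooled sensitivity and specificity are the inverse-logit transforms of \mu_{Se} and \mu_{Sp}, and the covariance term \sigma_{Se,Sp} captures the trade-off between the two measures across studies.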
A bivariate logit-normal random-effects model 16 was used to estimate pooled PPV from the observed prevalence of CIN2+ and CIN3+ (Model 1). To account for varying observed CIN2+/CIN3+ prevalence in included studies, the pooled sensitivity and specificity estimates obtained using the bivariate model 14 were used to generate a pooled PPV for varying expected CIN2+ and CIN3+ prevalence using PPV = Prev*SE/[Prev*SE + (1 − Prev)*(1 − Spec)] 17 (Model 2), where Prev is the expected prevalence, SE the pooled sensitivity and Spec the pooled specificity. We assumed no change in performance of DNA methylation assays with increasing prevalence of CIN2+/CIN3+.
Analysis 3. Relative sensitivity and relative specificity, with 95% confidence intervals (CI), of DNA methylation assays for CIN2+ and CIN3+ detection were estimated in comparison with the other most widely reported test strategies, including HPV16/18 genotyping and cervical cytology (ASCUS+ and LSIL+) evaluated as triage tests following a HR-HPV-positive test. Where studies did not restrict inclusion to HR-HPV-positive women only, the performance of DNA methylation assays was compared to that of qualitative HR-HPV DNA-based tests (Hybrid Capture II or PCR). Only those studies that provided direct head-to-head comparison of the two methods on the same population were included. The data on true positives, false positives, true negatives and false negatives for each test method and for each study were extracted into an Excel spreadsheet and imported into SAS. The sensitivity and specificity of DNA methylation were compared to those of cytology and/or HPV16/18 genotyping using metadas in SAS, 18 which allows comparison of test methods through inclusion of test method as a covariate. 19 We used sensitivity estimates for DNA methylation assays based on a threshold to achieve 70% specificity where studies allowed.
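The Model 2 calculation, in which PPV is projected across a range of expected prevalences from a fixed sensitivity and specificity, can be illustrated with a short sketch. The function name and the prevalence grid are hypothetical; the sensitivity of 68.6% at a set specificity of 70% is the pooled CIN2+ estimate quoted in the abstract, used here purely as example input.

```python
def ppv(prev, sens, spec):
    """Positive predictive value from prevalence, sensitivity and specificity.

    Implements PPV = Prev*SE / [Prev*SE + (1 - Prev)*(1 - Spec)].
    """
    for name, value in (("prev", prev), ("sens", sens), ("spec", spec)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie between 0 and 1")
    true_pos = prev * sens                # expected fraction testing true positive
    false_pos = (1 - prev) * (1 - spec)   # expected fraction testing false positive
    if true_pos + false_pos == 0:
        raise ValueError("PPV is undefined when no positive results are expected")
    return true_pos / (true_pos + false_pos)


# Example inputs: pooled CIN2+ sensitivity at a set specificity of 70%
POOLED_SENS = 0.686
SET_SPEC = 0.70

for prevalence in (0.10, 0.20, 0.30, 0.40):  # hypothetical prevalence grid
    print(f"prevalence {prevalence:.0%}: PPV {ppv(prevalence, POOLED_SENS, SET_SPEC):.1%}")
```

Because sensitivity and specificity are held constant, any change in the projected PPV across the grid is driven by prevalence alone, which mirrors the assumption stated for Model 2.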
For each of the three analyses, separate sub-analyses were conducted for discrete outcomes of CIN2+ and CIN3+. Data were analysed using Stata (version 16) and SAS (version 9.4).
Methodological quality assessment
Study quality was assessed using the QUADAS-2 tool for the quality assessment of diagnostic accuracy studies. 20 Assessments were based on: participant selection characteristics (location, inclusion and exclusion criteria, study size and age distribution); proportion of women with CIN2+/CIN3+ included; whether the index test (DNA methylation assay) and reference test (histology) were well described; indication for biopsy (i.e. whether all women had biopsy taken irrespective of screening or triage test abnormality) and whether there was independent validation of histopathology diagnosis (Supplementary Tables 1, 2).
Studies were ranked in quality/robustness of design (linked to evidence for effectiveness of cervical cancer screening) in decreasing order of randomised clinical trial or randomised population-based trial, cohort studies, case-control studies and convenience sampling studies. 21 Our review was reported according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) 22 and the Meta-analysis of Observational Studies in Epidemiology (MOOSE) guidelines. 23 This review is registered on the PROSPERO database at the Centre of Reviews and Dissemination, University of York; registration number CRD42016052119. The full dataset is available online at (https://doi.org/10.17632/84khm3rf8k.1).
The quality of individual studies assessed using QUADAS-2 scores is summarised in Fig. 2, Supplementary Table 1. The majority of included studies were convenience or case-control studies, or among women with HPV16 infection only, and many of these studies had an overrepresentation of women with CIN2+. In 15 (35%) studies, histological verification was available for all women (i.e. colposcopy-directed biopsies were taken when indicated and random biopsies taken from women with normal colposcopy findings), and the remaining studies had histological verification only among women for whom biopsy was indicated following an abnormal colposcopy result (Supplementary Table 2).
Associations of individual DNA methylation markers with grade of cervical intraepithelial neoplasia (Analysis 1)
DNA methylation increased with increasing grades of CIN for all DNA methylation markers (Supplementary Fig. 1). Compared to women with ≤CIN1, women with CIN2 had an increased risk of being methylation-positive by any of the seven markers (crude OR = 2.83, 95% CI: 2.01-4.00, I² = 63%; Supplementary Table 4). Women with CIN3 and ICC were at higher risk of being methylation-positive compared to women with ≤CIN1 (CIN3 vs. ≤CIN1: crude OR = 7.92, 95% CI: 6.10-10.29, I² = 43%; ICC vs. ≤CIN1: crude OR = 32.11, 95% CI: 22.51-45.79, I² = 0%). When restricting the analysis to women with CIN2 and CIN3 only, the risk of methylation positivity was higher among women with CIN3 compared to women with CIN2 (crude OR = 2.95, 95% CI: 2.03-4.27, I² = 71%). This association was observed for all genes, with the exception of MAL, MIR-124-2 and POU4F3, although there was a small number of studies included for these genes.
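The quantities reported above (a crude OR with 95% CI per comparison, a random-effects pooled estimate and I²) can be sketched as follows. The review states only that crude ORs and random-effects meta-analysis were used; the Woolf-type log-scale interval, the 0.5 continuity correction and the DerSimonian-Laird estimator are assumptions made for this illustration, and all counts are hypothetical.

```python
import math


def log_odds_ratio(pos_case, neg_case, pos_ref, neg_ref):
    """Log odds ratio of methylation positivity (cases vs reference) and its variance."""
    cells = [pos_case, neg_case, pos_ref, neg_ref]
    if any(c == 0 for c in cells):
        cells = [c + 0.5 for c in cells]  # continuity correction (assumption)
    a, b, c, d = cells
    return math.log((a * d) / (b * c)), 1 / a + 1 / b + 1 / c + 1 / d


def to_or_ci(log_or, variance, z=1.96):
    """Back-transform a log OR and its variance to (OR, lower, upper)."""
    half_width = z * math.sqrt(variance)
    return math.exp(log_or), math.exp(log_or - half_width), math.exp(log_or + half_width)


def pool_random_effects(log_ors, variances):
    """DerSimonian-Laird pooling; returns (pooled log OR, its variance, I2 in %)."""
    if len(log_ors) < 2 or len(log_ors) != len(variances):
        raise ValueError("need at least two studies with matching variances")
    w = [1 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, log_ors)) / sum_w
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, log_ors))  # Cochran's Q
    df = len(log_ors) - 1
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0  # between-study variance
    w_star = [1 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, log_ors)) / sum(w_star)
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return pooled, 1 / sum(w_star), i2


# Hypothetical counts per study: (meth+ CIN3, meth- CIN3, meth+ <=CIN1, meth- <=CIN1)
studies = [(30, 20, 10, 40), (45, 30, 20, 55), (12, 18, 8, 42)]
estimates = [log_odds_ratio(*s) for s in studies]
pooled_log, pooled_var, i2 = pool_random_effects(
    [e[0] for e in estimates], [e[1] for e in estimates]
)
print(to_or_ci(pooled_log, pooled_var), f"I2 = {i2:.0f}%")
```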
Performance of methylation markers relative to cytology and HPV16/18 genotyping (Analysis 3)
In 11 studies that compared the performance of DNA methylation and cervical cytology among HR-HPV-positive women, DNA methylation assays were marginally less sensitive for CIN2+ and CIN3+ detection compared to cytology ASCUS+ (DNA methylation versus ASCUS+: relative sensitivity = 0.81, 95% CI: 0.63-1.04 for CIN2+). Similarly, relative sensitivity of DNA methylation was lower and relative specificity was higher for CIN2+/CIN3+ when compared to a cytology cut-off of LSIL+, although there were fewer studies (Table 4).
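As a rough illustration of what a relative sensitivity expresses, the sketch below computes the ratio of two sensitivities with a log-scale confidence interval for a single study. It is not the method used in the review, which pooled studies with a bivariate model including test type as a covariate; this simplified version treats the two tests as independent and so ignores the pairing of results within women. The function name and counts are hypothetical.

```python
import math


def relative_sensitivity(tp_index, fn_index, tp_comp, fn_comp, z=1.96):
    """Ratio of index-test to comparator sensitivity with an approximate CI.

    Uses the log-scale (delta-method) interval for a ratio of two
    independent proportions; pairing of tests is ignored (assumption).
    """
    if tp_index <= 0 or tp_comp <= 0:
        raise ValueError("both tests need at least one true positive")
    n_index = tp_index + fn_index
    n_comp = tp_comp + fn_comp
    ratio = (tp_index / n_index) / (tp_comp / n_comp)
    se_log = math.sqrt(1 / tp_index - 1 / n_index + 1 / tp_comp - 1 / n_comp)
    lower = math.exp(math.log(ratio) - z * se_log)
    upper = math.exp(math.log(ratio) + z * se_log)
    return ratio, lower, upper


# Hypothetical study: 100 women with CIN2+, methylation detects 60, cytology detects 75
print(relative_sensitivity(60, 40, 75, 25))
```

A ratio below 1 with an interval that includes 1, as reported for methylation versus ASCUS+, indicates lower observed sensitivity that is not statistically distinguishable from equal sensitivity.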

DISCUSSION
This meta-analysis investigating the performance of DNA methylation of human genes and HPV virus for the detection of CIN2+ and CIN3+ indicates that DNA methylation of several human genes and HPV16 L1/L2 increased with increasing CIN grade, with significantly higher methylation in CIN3 compared to CIN2, and almost universally high methylation in ICC, confirming the relevance of these markers as potentially useful in the screening and triage settings for the most advanced lesions.

a The sensitivity estimate derived based on 70% specificity is used in this analysis when multiple estimates are given; for all other studies, the sensitivity and specificity estimates as reported by authors are used.
b One study 36 provides estimates for two genes (EPB41L3 and SOX1); in this analysis only data for EPB41L3 are included so that data from the same population of women are considered only once; similarly, for ref. 38, data for PAX1 and not SOX1 are included.
c For analysis of case-control studies, ref. 10 was removed from the analysis, and for convenience studies ref. 49 was removed from the analysis, to allow best model fit in Stata.
d Among women with HPV16 infection.
e 24 studies allowed standardization of threshold-level for methylation positivity, which corresponded to a specificity of 70% for CIN2+, either by providing the raw data or a ROC curve; 17 studies allowed estimation at a specificity of 50% for CIN2+; 19 studies allowed estimation at a specificity of 70% for CIN3+; 14 studies allowed estimation at a specificity of 50% for CIN3+.
f One study 32 provides estimates for two gene combinations (CADM1/MAL and MAL/MiR-124-2) and only data for the MAL/MiR-124-2 combination were used in the analysis (as authors conclude this was the best combination of markers).
At an expected CIN2+ and CIN3+ prevalence of ~30 and 20%, equivalent for example to a referral population of women with a HR-HPV-positive test, 27,33 DNA methylation assays had marginally lower sensitivity for CIN2+ detection and higher specificity compared to cytology (ASCUS+ or LSIL+) and higher sensitivity compared to HPV16/18 genotyping for a similar specificity. Although there were too few studies to conduct a discrete meta-analysis, the S5 classifier had higher sensitivity for CIN2+ detection compared to the human gene EPB41L3 alone without compromising specificity, 33,42 suggesting the combination of viral and host cell targets may improve accuracy to detect CIN2+. Future studies may evaluate methylation of a wider range of HPV types found to be associated with CIN2+.
An optimal triage test should have high sensitivity to ensure women with confirmed high-grade lesions receive appropriate management and high PPV to ensure women who test positive are accurately targeted for management, avoiding overtreatment, associated costs and patient anxiety. In our review, DNA methylation assays generated a pooled PPV of 53 and 35% for CIN2+ and CIN3+, with corresponding sensitivity of 69 and 71%, respectively. These estimates are similar to those reported for cytology ASCUS+ among 535 HR-HPV-positive women enrolled in a population-based screening study in the Netherlands, 67 which reported PPV and sensitivity of 60 and 63%, respectively, for CIN2+ and 42 and 71%, respectively, for CIN3+. The corresponding estimates for HPV16/18 genotyping were 38 and 59%, respectively, for CIN2+, and 26 and 65%, respectively, for CIN3+. Among 614 HR-HPV-positive women participating in the Canadian Cervical Cancer Screening Trial, 68 cytology ASCUS+ had a lower PPV of 30% with a low sensitivity (48%) for CIN2+, while HPV16/18 genotyping had a PPV and sensitivity of 32 and 64%, respectively.
There were too few prospective studies evaluating DNA methylation markers to conclusively assess their potential as predictors of future or progressing cervical lesions. However, three recent studies highlight their potential usefulness in that regard. A longitudinal study among 1040 HPV-positive women enrolled in the POBASCAM screening trial in the Netherlands reported that, compared to a cytology negative (<ASCUS) result at enrolment, a negative FAM19A4/MIR124-2 methylation test indicated lower risk of cervical cancer incidence over a 14-year follow-up period (Risk Ratio = 0.74, 95% CI: 0.16-1.40).
69 In a cohort of women living with HIV in South Africa, participants with persistent CIN3, or CIN2 which progressed to CIN3, had significantly higher baseline EPB41L3 methylation levels compared to women who remained ≤CIN1 over 16 months, and compared to women with spontaneous regression to ≤CIN1. 25 In a study among 149 women with CIN2 that were followed over 2 years in Finland, the S5-classifier had the highest sensitivity to predict CIN2 lesions that progressed to CIN3 from those that spontaneously regressed to ≤CIN1 compared to cytology (using various cut points), HPV16/18 or HPV16/18/31/33 genotyping. 70 In comparison to other triage tests such as cytology and p16 INK4A staining, the advantages afforded by DNA methylation assays are that their molecular basis makes them automatable and less prone to training and interpretational errors than the morphological tests. Testing can be performed using the same clinician-collected or self-collected sample used for HPV screening, 30 thereby simplifying sample collection. Methylation could therefore become a useful alternative to cytology as a triage test among HR-HPV-positive women. Moreover, methylation assays provide an advantage over HPV16/18 genotyping as they are not restricted to detection of CIN2+ associated with HPV16/18 only, combining a higher sensitivity for CIN2+ with a similar specificity. While current methylation technologies may not yet be suitable for low-resource settings, technological advances may allow for use in such settings in the not too distant future.
There were too few studies in our review to allow an evaluation of DNA methylation assays as a primary screening test. However, eleven studies evaluating human gene DNA methylation assays among populations with high HR-HPV prevalence have shown that these assays had higher specificity compared to primary HPV DNA screening, albeit with lower sensitivity. Assays combining human genes and HPV viral methylation may therefore increase sensitivity for CIN2+ detection while retaining high specificity, a useful feature in populations with high prevalence of HR-HPV. Given the potential for self-sampling, this approach may allow for a one-sample one-visit screening, which would reduce the loss-to-follow-up of women in many low-resource settings where HR-HPV prevalence may be high and where access to screening may be limited, allowing the number of screening visits in a woman's lifetime to be reduced. It is important, however, that any recommendations for inclusion of methylation tests in screening or triage consider affordability, cost-effectiveness and ease of management.
There were a number of limitations to our review. First, there was significant heterogeneity in the pooled performance estimates, which may be explained by any of the following: (1) variability in study designs; (2) variability in the proportion of CIN2+ cases included in each study; (3) differences in the target genes and CpG sites studied and (4) variation in thresholds used to define methylation positivity. We sought to limit the effects of these variations in our analysis. We stratified performance estimates by study design to distinguish the performance of DNA methylation in studies that are in an early discovery phase (i.e. mostly convenience and case-control studies) from those studies focused on defining the performance of these markers for screening or triage in referral-population-based and cohort studies. In order to adjust for differences in methylation threshold levels, we derived pooled sensitivity from those studies that allowed us to set specificity at 70%. Where possible, we obtained pooled sensitivity for individual target genes, which revealed differences in sensitivity, with higher sensitivity achieved with combination markers compared to individual genes. Because PPV estimates correlate with prevalence of disease, we observed heterogeneity in the PPV estimates, largely due to the variability in the proportion of CIN2 cases included in each study. We controlled for this variability by generating a pooled PPV for different fixed levels of CIN2+ and CIN3+. We assumed no change in performance of DNA methylation assays with increasing prevalence of CIN2+, although future studies may demonstrate changes in sensitivity and/or specificity for CIN2+, depending on gene target, as we currently see for HR-HPV DNA-based tests. Second, this review may have some selection bias, as we limited ourselves to the most widely studied target genes and required a minimum number of reports for each gene.
There was clear overrepresentation of women enrolled in large studies in the Netherlands (PROHTECT and POBASCAM) and the UK (PREDICTORS-1 and -2), as these groups have been most active in this particular field. The associations of individual gene marker methylation with increasing CIN grades is limited by the low number of studies for several gene targets included in the analysis, and we were unable to present adjusted estimates. Finally, not all studies (35% of studies only) had histological endpoints for all women included in the analysis, as biopsy indication was often based on colposcopy findings, leading to some disease misclassification linked to the variable sensitivity of cytology and colposcopy. 71 In conclusion, DNA methylation assays show promise for the detection of CIN2+ in triage situations, combined with existing screening tools with high sensitivity but lower specificity, such as HPV DNA tests. Methylation could be a useful alternative to cytology as a triage test among HR-HPV-positive women, given their similar performance with the added advantages of objectivity, automation and self-collected sampling. Despite an increasing number of studies in recent years evaluating different gene targets, the strength of current evidence remains low, and randomised controlled trials and further large prospective studies following guidelines on rigorous biomarker evaluation 72 are needed.

AUTHOR CONTRIBUTIONS
H.K., P.M. and A.L. conceptualised the study and developed the research protocol; H.K. and A.L. identified articles for full-text review; H.K. and A.L. extracted data from studies that matched inclusion criteria; H.K. and Y.B.M. did the statistical analyses; all authors contributed to the writing of the manuscript.
Competing interests: The authors declare no competing interests.
Ethics approval and consent to participate: This systematic review and meta-analysis used previously published data and did not use any unpublished data. As such, ethical approval to conduct the analysis was not sought.
Publisher's note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.