Detection of colorectal cancer in urine using DNA methylation analysis

Colorectal cancer (CRC) is the second leading cause of cancer-related death globally. Clinically, there is an urgent need for non-invasive CRC detection. This study assessed the feasibility of CRC detection by analysis of tumor-derived methylated DNA fragments in urine. Urine samples, including both unfractioned and supernatant urine fractions, of 92 CRC patients and 63 healthy volunteers were analyzed for DNA methylation levels of 6 CRC-associated markers (SEPT9, TMEFF2, SDC2, NDRG4, VIM and ALX4). Optimal marker panels were determined by two statistical approaches. Methylation levels of SEPT9 were significantly increased in urine supernatant of CRC patients compared to controls (p < 0.0001). Methylation analysis in unfractioned urine appeared inaccurate. Following multivariate logistic regression and classification and regression tree analysis, a marker panel consisting of SEPT9 and SDC2 was able to detect up to 70% of CRC cases in urine supernatant at 86% specificity. This provides the first evidence for CRC detection in urine by SEPT9 methylation analysis, which combined with SDC2 allows for an optimal differentiation between CRC patients and controls. Urine therefore provides a promising liquid biopsy for non-invasive CRC detection.

www.nature.com/scientificreports/
epigenetic regulator of gene expression, frequently altered in cancer 13. Increased methylation in CpG-dense regions, called CpG islands and located in promoter regions of tumor suppressor genes, can lead to inactivation of these genes. As this is believed to be an early and critical event in CRC development, analysis of ctDNA methylation has potential to serve as a biomarker for CRC 3,14-16. Circulating tumor DNA is excreted in urine through glomerular filtration, which makes urine analysis a truly non-invasive method of cancer detection 17-19. Despite advances in ctDNA diagnostics in blood, venipuncture remains invasive and many challenges still have to be tackled. Analysis of ctDNA in urine offers a non-invasive and logistically attractive way of testing.
Urinary cfDNA can roughly be divided into two groups based on fragment size: a high-molecular-weight (high-MW) group and a low-molecular-weight (low-MW) group. The high-MW group consists of heterogeneous DNA fragments of 1 kilobase pair (kbp) and larger, typically originating from cell debris of the urogenital tract 20,21. The low-MW group consists of smaller DNA fragments between 150 and 250 bp. Presumably, the low-MW fraction of urinary DNA is partially derived from the blood circulation, allowing detection of ctDNA in urine samples 21,22. One method by which urine samples can be enriched for low-MW DNA is centrifugation, which partly separates potential tumor DNA from non-specific high-MW DNA 18. In contrast to, for example, bladder cancer, in which high-MW DNA present in urinary sediment is suitable for tumor DNA detection 23, the supernatant fraction is expected to be of most interest for detection of non-urogenital tumors, such as colon cancer, in urine.
In this study, we aimed to evaluate the diagnostic potential of urine DNA methylation analysis for detection of CRC.

Material and methods
Study subjects and sample processing. Consecutive patients suffering from CRC who visited the Surgery Department of Amsterdam UMC, a tertiary referral center in Amsterdam, The Netherlands, between January 2018 and February 2019 were included in the study. All patients were older than 18 years, were diagnosed with pathology-proven CRC, and had received no anticancer treatment during the preceding year. Patients with other malignancies in the previous 3 years were excluded. All participating patients provided urine samples during visits to the outpatient clinic prior to surgery. Samples were collected in 40 ml containers containing 40 mM ethylenediaminetetraacetic acid (EDTA) and subsequently processed within 6 h and stored at 4 °C. Both addition of EDTA and storage at 4 °C preserve urine DNA for accurate methylation analysis 24. Healthy volunteers, serving as controls, were selected for eligibility through a pre-defined selection process. By means of a questionnaire, it was verified that control subjects had not been diagnosed with cancer at any point during their life and matched the age range of the CRC patient test groups. Urine samples from controls were also collected in containers containing 40 mM EDTA and processed upon arrival. For both CRC patients and controls, an independent set of consecutively collected urine samples was used for our studies on unfractioned urine and urine supernatant. To obtain the urine supernatant fraction and enrich for low-MW DNA, samples were centrifuged at 3000 g for 15 min. Unfractioned and supernatant urine specimens were frozen at −20 °C until further use.
Sample collection and study design were approved by the Medical Ethical Committee board of the Amsterdam UMC for both CRC patients and healthy volunteers (no. 2018.035 and no. 2018.657). Written informed consent was obtained from all participants of this study. All experiments were performed in accordance with relevant guidelines and regulations.
DNA isolation and bisulfite modification. For DNA isolation from unfractioned urine and urine supernatant, the Quick DNA urine kit (Zymo Research, Irvine, CA, US) was used. This isolation method proved superior to other DNA isolation methods for the isolation of small DNA fragments (as low as 50 bp) (data not shown). DNA from the CRC cell line RKO (American Type Culture Collection) was isolated using the PureLink genomic DNA kit (Invitrogen, Waltham, MA, US).
Isolated DNA was eluted in 50 µl elution buffer. DNA concentrations were measured using the Qubit™ dsDNA HS Assay (Invitrogen, Carlsbad, CA, US). For methylation analysis, up to 400 ng of isolated DNA was treated with bisulfite using the EZ DNA Methylation kit (Zymo Research, Irvine, CA, US). All procedures were performed according to the manufacturer's guidelines.
Two multiplex quantitative methylation-specific PCRs (qMSPs), each consisting of 3 targets (SEPT9, TMEFF2 and SDC2; and NDRG4, VIM and ALX4) and a reference gene (β-actin: ACTB), were designed based on sequences as described previously 25-29. By adjusting amplicon sizes to a maximum of 80 bp, detection of CRC-derived low-MW DNA was facilitated. Multiplex development was executed according to optimization parameters as described by Snellenberg et al. 30. In brief, marker specificity was individually evaluated by MSP with unmodified and modified DNA isolated from CRC cell line RKO. Sensitivity analysis demonstrated that the two qMSP multiplexes were able to detect methylated RKO DNA diluted in water down to dilutions of 0.1% and 0.5%, respectively. Primer and probe limiting assays were performed to determine their ideal concentrations in both singleplex and multiplex qMSPs. qMSP analysis was performed on a ViiA7 real-time PCR system (ThermoFisher Scientific, Waltham, MA, USA), using Epitect Multiplex PCR Mastermix (Qiagen, Venlo, Netherlands). Methylation marker abundance was calculated relative to ACTB levels (Ct-ratio), using the following formula: $2^{-(Ct_{MARKER} - Ct_{ACTB})} \times 100$. Further details are provided in supplementary S1.
Data analysis. Ct-ratios of methylation targets were compared between groups using the Mann-Whitney U test. Results from statistical tests were corrected for multiple testing by the Bonferroni procedure. Differences in absolute detection rates were compared and tested for statistical significance with Pearson's Chi-square test. p values < 0.05, adjusted using Bonferroni correction, were considered to be statistically significant. Analyses of relationships between methylation and clinical parameters were only performed for marker SEPT9, since none of the other markers had sufficient data points for additional statistical analysis.
In the patient group, SEPT9 methylation levels were compared between cancer stages by the Kruskal-Wallis test. Because a large proportion of stage IV patients in our study had only peritoneal metastases (68%), results of stage IV patients were split into patients with peritoneal metastases only and stage IV patients with any type of metastasis, including hematogenous metastases. In this study, the hematogenous metastases were exclusively liver metastases.
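As an illustration of the Ct-ratio and the Bonferroni correction described in the methods, the following Python sketch may help. It is not the authors' code (the study used SPSS and R); the function names and the example Ct values and p values are hypothetical.

```python
def ct_ratio(ct_marker, ct_actb):
    """Marker methylation relative to ACTB: 2^-(Ct_marker - Ct_ACTB) * 100."""
    return 100.0 * 2.0 ** (ct_actb - ct_marker)


def bonferroni(p_values):
    """Bonferroni-adjusted p values: multiply by the number of tests, cap at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


if __name__ == "__main__":
    # Hypothetical Ct values: the marker amplifies 3 cycles later than ACTB.
    print(ct_ratio(ct_marker=33.0, ct_actb=30.0))  # 12.5
    # Hypothetical raw p values for three markers.
    print(bonferroni([0.001, 0.02, 0.5]))
```

A marker that amplifies at the same cycle as ACTB gives a Ct-ratio of 100, and each additional cycle of delay halves the ratio.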
To determine the ability of a combination of markers to differentiate between controls and patients, two approaches were explored for determining both the best marker panel and the marker thresholds for maximal test accuracy. In the first method, multivariate logistic regression (MLR) was used to model the probability of a urine sample being from a CRC patient, with all six methylation markers as independent variables. First, we fitted a model with the six main effects only, and selected markers by stepwise selection. Then, to investigate whether the in-model effect of an individual marker was affected by other markers, we added the two-way interaction terms that include the selected main effects, again followed by stepwise selection. A leave-one-out cross-validation was then used to evaluate the performance of the model for prediction. Next, the predicted probability from this cross-validation was used for sample classification, according to a maximal Youden's index, i.e. the sum of sensitivity and specificity minus 1. For fitting the MLR model, the R function glm (generalized linear models) was used. Apart from the MLR, we applied an algorithm-based method called classification and regression tree (CART) for binary classification of cases and controls on the same set of methylation markers. For this alternative analysis, a decision tree was obtained allowing for classification of urine samples based on marker values. We refer to 31 for further details on the CART method. For the purpose of prediction, the predicted class was obtained by leave-one-out cross-validation. For both building the decision tree and performing prediction, the R package rpart (recursive partitioning) was used.
The performance of both methods was determined from the obtained sensitivities and specificities. For logistic regression, the receiver operating characteristic (ROC) curve was plotted together with the maximized Youden's index.
Statistical analyses were performed using SPSS software (SPSS 22.0, IBM, Armonk, NY, USA) and R (Vienna, Austria). Data visualization and construction of graphs were facilitated by GraphPad Prism (version 8.2.1, La Jolla, CA, USA). Additional details of all statistical analyses can be found in supplementary file S1.
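The threshold selection by maximal Youden's index can be sketched as follows. This is a minimal Python illustration rather than the authors' R implementation. It assumes the cross-validated predicted probabilities are already available, and the function name and example data are hypothetical.

```python
def youden_threshold(probabilities, labels):
    """Return (threshold, J, sensitivity, specificity) maximizing Youden's J.

    labels: 1 = CRC case, 0 = control. A sample is called positive when its
    predicted probability is >= threshold. On ties, the lowest threshold wins.
    """
    n_cases = sum(labels)
    n_controls = len(labels) - n_cases
    best = None
    for t in sorted(set(probabilities)):
        tp = sum(1 for p, y in zip(probabilities, labels) if y == 1 and p >= t)
        tn = sum(1 for p, y in zip(probabilities, labels) if y == 0 and p < t)
        sens = tp / n_cases
        spec = tn / n_controls
        j = sens + spec - 1  # Youden's index
        if best is None or j > best[1]:
            best = (t, j, sens, spec)
    return best


if __name__ == "__main__":
    # Hypothetical out-of-sample probabilities (e.g. from leave-one-out CV).
    probs = [0.1, 0.2, 0.35, 0.4, 0.6, 0.7, 0.8, 0.9]
    truth = [0, 0, 1, 0, 0, 1, 1, 1]
    print(youden_threshold(probs, truth))  # (0.7, 0.75, 0.75, 1.0)
```

Scanning only the observed probabilities as candidate thresholds is sufficient, because sensitivity and specificity change only at those values.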

Results
Patient and sample characteristics. In total 47 CRC patients and 20 healthy controls were included in the unfractioned group, and 45 CRC patients and 43 controls in the supernatant group. Clinical characteristics of CRC patients and controls with valid qMSP results are depicted in Table 1.
The DNA yields of both unfractioned urine and urine supernatant collected from CRC patients and controls were evaluated to assess their utility for methylation analysis. The sample DNA concentrations are shown in Table 2. Concentrations in unfractioned urine samples were approximately three to five times higher than in supernatant samples, for both patients and controls. Regarding sex, unfractioned urine DNA concentrations of female subjects were two times higher than DNA concentrations measured in male subjects.
DNA methylation detection rates in unfractioned urine samples. DNA methylation of SEPT9, TMEFF2, SDC2, NDRG4, VIM and ALX4 in unfractioned urine samples of CRC patients (n = 47) and controls (n = 20) was investigated to evaluate their potential for CRC detection. Elevated methylation levels were detected in a subset of CRC patients for SEPT9, and at low frequencies for VIM and ALX4 (Fig. 1a). Following Bonferroni correction, none of the markers was found to be significantly different between patients and controls. Likewise, no significant differences were seen for methylation detection rates, defined as any positive signal in qMSP analysis (i.e. Ct value < 45) (Fig. 1b). Methylation marker SEPT9 was detected in all CRC patients as well as nearly all controls (90%). The remaining markers were detectable in 2-36% of CRC patients and 0-20% of controls.
DNA methylation detection rates in urine supernatant samples. Next, we determined whether the supernatant fraction, which is presumed to be enriched for cfDNA 18, would allow for a better discrimination between patients and controls. The same methylation markers were tested on urine supernatants from an independent cohort of CRC patients (n = 45) and controls (n = 44). As shown in Fig. 1c, SEPT9 methylation levels were significantly elevated in CRC patients compared to controls (p < 0.0001). No significant differences were found for the remaining five markers.
Assessment of the absolute detection rates also demonstrated that SEPT9 methylation analysis detected significantly more CRC patients than controls (p < 0.01) (Fig. 1d). No differences in detection rates between the two groups were found for the other five markers.
In the group of CRC patients, no difference in SEPT9 methylation levels was found between the different clinical stages of CRC disease (Fig. 2). However, within the stage IV patients, SEPT9 methylation levels were significantly increased in patients whose primary tumor was still present during urine collection, compared to stage IV patients with a history of resection of the primary CRC tumor (p < 0.01). Patients with solely peritoneal metastases showed a trend towards lower levels of urine ctDNA compared to patients with liver metastases.
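The detection-rate comparison can be illustrated with a short Python sketch. It uses the Ct < 45 definition of a positive signal from the text and a Pearson Chi-square test on a 2x2 table. It is an illustration, not the authors' SPSS analysis: the function names are hypothetical, the test is computed without continuity correction, and all margins of the table are assumed to be non-zero.

```python
import math

CT_CUTOFF = 45  # any qMSP signal with Ct < 45 counts as detected


def detection_rate(ct_values):
    """Fraction of samples with a positive signal; None means no amplification."""
    detected = sum(1 for ct in ct_values if ct is not None and ct < CT_CUTOFF)
    return detected / len(ct_values)


def chi_square_2x2(a, b, c, d):
    """Pearson Chi-square for the table [[a, b], [c, d]] (no continuity correction).

    Rows: patients / controls. Columns: detected / not detected.
    Returns (statistic, p_value) for 1 degree of freedom.
    """
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom, P(chi2 > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


if __name__ == "__main__":
    # Hypothetical Ct values for one marker in four samples.
    print(detection_rate([30.1, None, 44.9, 45.0]))  # 0.5
    # Hypothetical counts, not taken from the study.
    print(chi_square_2x2(20, 0, 0, 20))
```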
Discriminating potential of combined methylation markers in urine supernatant. The [...] controls and CRC patients. Both multivariate logistic regression (MLR) and classification and regression tree (CART) analyses were used to assess accuracy. Furthermore, discrepancies between the models with regard to sample classification were determined. To allow for a complete analysis, CRC samples (n = 2) and control samples (n = 1) that had an invalid ACTB result in one of the two multiplexes were discarded, which resulted in a total of 43 CRC and 42 control urine supernatants to be evaluated.
When fitting the MLR model with main effects, only SEPT9 was significantly associated with the probability of being in the case group (p < 0.0001). Therefore, a logistic regression was fitted with SEPT9 and its interaction terms with the other five markers. The stepwise selection procedure selected SEPT9 methylation and the interaction term between SEPT9 and SDC2 methylation as strongly associated with the probability of being a CRC patient. Figure S2 illustrates the behavior of the estimated probability of being a CRC case for various values of SEPT9 and SDC2. In this model, when methylation levels of SEPT9 were low, the probability of being a case was small, irrespective of SDC2 levels. At higher values of SEPT9, however, the gradual increase in the predicted probability of being a case was affected by the value of SDC2. This explains the interactive effect of SEPT9 and SDC2.
In the second approach, we used a CART model to classify cases and controls based on the values of each marker. The resulting decision tree is depicted in Fig. 3. As in the logistic regression, SEPT9 is the most important predictor for the classification, but again SDC2 constitutes an interaction variable. When SEPT9 methylation was higher than or equal to −0.098, subjects were classified as cases (Fig. 3, node 3). In node 3, there are 29 correctly classified and 4 misclassified subjects. In the next step, subjects were classified as controls when SEPT9 methylation was lower than −2.9. In this branch, 27 subjects were correctly classified and 5 subjects were misclassified (node 4). Finally, for the remaining subjects, with SEPT9 values between −2.9 and −0.098, the threshold of SDC2 = −1.8 determined the controls (i.e. SDC2 ≤ −1.8, with 9 correctly classified and 4 misclassified subjects) and cases (i.e. SDC2 > −1.8, with 5 correctly classified and 2 misclassified subjects).
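The decision rules of the tree can be written out as a small Python function. This is a sketch based only on the thresholds quoted in the text. It assumes the marker values are on the same scale as in Fig. 3 (the scale is not stated in this section) and that the SDC2 split applies to subjects with SEPT9 between −2.9 and −0.098, since lower SEPT9 values are already assigned to controls. The function name is hypothetical.

```python
def classify_cart(sept9, sdc2):
    """Classify a urine supernatant sample as 'case' or 'control'.

    Thresholds are those quoted in the text; inputs are assumed to be on
    the same (unstated) scale as the marker values in Fig. 3.
    """
    if sept9 >= -0.098:
        return "case"      # node 3
    if sept9 < -2.9:
        return "control"   # node 4
    # Assumed intermediate branch: -2.9 <= SEPT9 < -0.098, split on SDC2.
    return "case" if sdc2 > -1.8 else "control"


if __name__ == "__main__":
    # Hypothetical marker values.
    print(classify_cart(sept9=0.5, sdc2=-4.0))   # case
    print(classify_cart(sept9=-3.5, sdc2=0.0))   # control
    print(classify_cart(sept9=-1.0, sdc2=-1.0))  # case
    print(classify_cart(sept9=-1.0, sdc2=-2.5))  # control
```

The leaf counts reported in the text (33, 32, 13 and 7 subjects) add up to the 85 evaluated samples, which is consistent with this reading of the tree.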
Finally, leave-one-out cross-validation was used to evaluate the prediction performance of MLR and CART. The MLR model yields the estimated probability of being a case, whereas the CART decision tree assigns a sample as case or control (Figure S4). Hence, unlike the CART decision tree, logistic regression allowed a receiver operating characteristic (ROC) curve to be drawn (Fig. 4). A maximized Youden's index was used to compare the performances of both methods. Figure 5 illustrates the performance of both models. In general, the performance of MLR with interaction and that of CART were similar. While the MLR provided slightly higher sensitivity compared to CART (70% vs 67%), the latter had a slightly better specificity (88% vs 86%). Furthermore, the two models agreed on the classification of > 90% of all samples (Fig. 5). Both models were also able to detect CRC independent of cancer stage.
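The performance measures compared here (sensitivity, specificity and agreement between two models) can be computed as in the following Python sketch. The function names and the example labels are hypothetical and do not reproduce the study data.

```python
def sensitivity_specificity(predicted, actual):
    """Return (sensitivity, specificity); labels are 1 = case, 0 = control."""
    tp = sum(1 for p, a in zip(predicted, actual) if a == 1 and p == 1)
    fn = sum(1 for p, a in zip(predicted, actual) if a == 1 and p == 0)
    tn = sum(1 for p, a in zip(predicted, actual) if a == 0 and p == 0)
    fp = sum(1 for p, a in zip(predicted, actual) if a == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def agreement(pred_a, pred_b):
    """Fraction of samples that two models classify identically."""
    same = sum(1 for a, b in zip(pred_a, pred_b) if a == b)
    return same / len(pred_a)


if __name__ == "__main__":
    # Hypothetical cross-validated predictions from two models.
    truth = [1, 1, 1, 0, 0, 0]
    model_1 = [1, 1, 0, 0, 1, 0]
    model_2 = [1, 0, 0, 0, 1, 0]
    print(sensitivity_specificity(model_1, truth))
    print(agreement(model_1, model_2))
```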

Discussion
This study demonstrates for the first time that urine of CRC patients contains elevated levels of the DNA methylation marker SEPT9, as compared to healthy controls. SEPT9 methylation, combined with marker SDC2, offers a potential novel tool for detection and monitoring of CRC. Using short-amplicon methylation-specific PCRs, we have successfully detected CRC-associated DNA methylation in urine supernatant. Out of six markers tested, SEPT9 showed the best accuracy as a potential urinary biomarker for CRC detection. By combining SEPT9 and SDC2, up to 70% of CRC cases could be detected at a specificity of 86%. Despite extensive research, a need still exists for a non-invasive biomarker to detect CRC during clinical management. While many studies have been performed on the use of ctDNA in plasma for these purposes, a possible role for urine has not yet been well elucidated. Urine as a biofluid offers several advantages over blood. It does not require trained professionals to acquire, it lends itself to easy repeated sampling, and it has been shown that urine is a very stable medium for DNA 24. This allows for reliable testing of samples collected in an ambulatory setting. Additionally, there are no limits to available quantities. For screening programs of other types of cancer, urine is currently being evaluated as a non-invasive alternative to physician-involved diagnostics 32-35. The results from the present study suggest that urine has the same potential for CRC detection, by showing that ctDNA is detectable by means of DNA methylation analysis. This report is among the first few publications exploring the feasibility of molecular analysis in urine for the purpose of non-invasive CRC detection.
In a pioneer study by Su et al., the distinction between high and low-molecular weight (HMW and LMW) urine DNA for CRC detection was made, showing the latter provides higher accuracy for detection of CRC-specific KRAS DNA mutations 22 . Some methylation markers tested in the present study have been described before for CRC detection in urine samples. Methylation marker VIM was assessed in two separate studies 28,36 . In a study that selected low molecular weight DNA from urine samples using carboxylated magnetic beads, 12 of 17 LMW (71%) urine DNA samples of CRC patients were found to be positive for VIM methylation, compared to two out of 20 (10%) control samples 28 . Another earlier publication however, showed a poor performance of VIM methylation detection in urine (i.e. 8% sensitivity at 100% specificity). This study also assessed the methylation markers WIF-1 and ALX4 in urine, for which respectively a sensitivity of 27% and 15% at 99% and 100% specificity was found 36 . Detection of NDRG4 methylation in urine has been described earlier by Xiao et al. with 55 of 76 (73%) CRC cases testing positive for urine methylation, at a specificity of 85% based on 36 controls 29 .
Differences in methodology or sample population may explain the discrepancies in CRC detection rates of methylation markers NDRG4 and VIM. In the present study, centrifugation appeared effective for enrichment of highly fragmented tumor DNA (low-MW), as supernatant samples enabled a more adequate differentiation between disease and healthy controls compared to unfractioned urine, which contains both high-MW and low-MW DNA. Furthermore, we used a dedicated urine DNA isolation kit. Interestingly, none of the previous studies mentioned fractionated urine samples prior to DNA isolation or used specialized urine DNA kits. Song et al. stored urine samples at −70 °C directly following collection, and isolated low-MW DNA using magnetic bead-based selection and resin-based DNA isolation after defrosting of the urine samples 28. Xiao et al. isolated DNA directly following collection; however, no pre-PCR DNA size selection was performed and no information was given on the isolation method 29. Amiot et al. did not provide any details with regard to urine processing and used the same generic DNA isolation kit for urine, plasma and stool 36. Concerning differences in sample population, Song et al. collected control samples from patients who underwent a colonoscopy yielding negative results. The other studies did not provide further information on their controls. Furthermore, differences in test group ethnicity might have influenced baseline methylation levels 37. Another important reason for discrepancies in marker performance between the present and earlier studies could be differences in the actual targeted marker CpG dinucleotides undergoing PCR amplification.
Besides limited data on the use of urine for detection of CRC, little evidence is currently available for prognostication and disease monitoring. By detection of KRAS mutations in urine, only Fuji et al. explored treatment monitoring in advanced CRC, showing a decline of urinary KRAS mutations during effective systemic therapy 38.
To our knowledge, this is the first study describing detection of SEPT9 methylation in urine. Urine SEPT9 methylation analysis showed a sensitivity for CRC detection coming close to those reported for SEPT9 methylation analysis in plasma samples, which vary from 75 to 81% (at ≥ 96% specificity) 39. SEPT9 methylation analysis for CRC detection in plasma is now available as an FDA-approved commercial test (Epi ProColon 2.0, Epigenomics AG Corporation, Berlin, Germany). Established CRC plasma methylation markers tested in this study, other than SEPT9, showed lower detection rates. Possible explanations include both biological and technical causes. Urine as a biofluid might have properties that lead to decreased detection of certain ctDNA fragments. Pores present in the glomerular basement membrane (GBM) may select not only on molecular weight; the net negative electric charge of the membrane could also play a role in preventing blood-urine translocation of the negatively charged DNA 19,40. Other thermodynamic properties of DNA that lead to its polymorphic potential, as well as complex formation with, for instance, proteins, might also influence the probability of glomerular translocation. As these properties are strongly influenced by the particular nucleotide sequences of ctDNA, certain methylation markers might have a decreased performance in urine. A comparison between blood and urine samples from the same test subjects would therefore be very interesting to estimate the rate at which the methylation signal is lost due to GBM filtration.
No differences in urine SEPT9 methylation levels were found between clinical stages of CRC. Although the detection of ctDNA appears more likely when sampling occurs during higher clinical stages of neoplastic disease, this was not the case in the present study. Similarly, the study on VIM methylation in CRC urine samples also found no correlation with CRC stages 28. For stage IV patients, however, we found that the presence of the primary tumor may be a contributing factor for detecting CRC-associated methylation in urine DNA. This probably also relates to the fact that the majority of included stage IV patients without the primary tumor present were suffering from peritoneal metastases. Data from our group show that patients with peritoneal metastases are less likely to have detectable ctDNA 41. Furthermore, the majority of peritoneal metastases are classified as CMS4 (Consensus Molecular Subtypes). In this tumor subtype, methylation levels are often very low 42,43.
A novel approach in this study is the use of two different methods of statistical analysis to determine the complementarity between methylation markers to achieve maximal accuracy for CRC detection. The MLR and CART models agreed on classification of most urine samples, supporting the validity of our results. The MLR model gave a slightly higher sensitivity (70%) compared to the decision tree (67%), whereas the CART model yielded a slightly higher specificity (MLR: 86%, CART: 88%). In this study, the MLR model was optimized to achieve a maximal Youden's index but could, depending on the clinical context, be adjusted to achieve a higher sensitivity or specificity. Furthermore, it should be noted that especially the CART model provides a practical means to evaluate an individual's test results, by simply reading off the marker values and subsequently following the branches of the tree. The MLR model, on the other hand, provides a probability of being a case based on methylation marker values. Therefore, this model is more generalizable to the whole population compared to the decision tree. Hence, either method can be used, depending on the requirements of the specific clinical setting.
A limitation of this feasibility study is the relatively small set of samples. Also, the control group of the urine supernatant cohort included somewhat younger subjects and relatively more females than the patient group (Table 1). Although we aimed for completely age-matched case and control groups and succeeded in the unfractioned urine cohort, there was a small age difference in the urine supernatant cohort (median age 66 vs 60). Furthermore, no patients with CRC precursor lesions or other cancer types were included, which would better define a potential role of urine in cancer screening. In this study, markers were based on a systematic literature review on blood markers for CRC detection 3. Genome-wide methylation analysis of CRC urine specimens could possibly discover novel markers with better performance in urinary cfDNA. Furthermore, in this study, 40 ml of urine was used for DNA isolation, which is a technical limitation of the methodology. Other studies investigating urine for cancer detection have utilized larger volumes (up to 120 ml) of urine 38,44. Methylation analysis with higher inputs of urine for DNA isolation might increase test accuracy and reduce test failures.
The promising results of the present study warrant further verification and validation studies, comprising larger sample series and application of pre-defined thresholds of the models used in this study. Depending on the designated clinical setting, however, a validation study might require thousands of test subjects 45. Recent large-scale efforts to screen for cancer in a population setting have shown DNA methylation to be particularly suited for cancer detection and tissue of origin localization 46. [...] attractive to combine methylation analysis with additional molecular ctDNA markers, such as DNA mutations and/or copy number variations 47, or with the analysis of CRC-associated metabolites 48.
In conclusion, this study demonstrates the feasibility of urine supernatant for detection of CRC. Through means of DNA methylation analysis of a marker panel consisting of SEPT9 and SDC2, CRC could be detected with high accuracy. This is the first step in conceiving a urine-based test for CRC and ultimately, other cancers as well.

Data availability
The data generated and analyzed during the current study are available from the corresponding author on reasonable request.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.