Introduction

Colorectal cancer is the fourth most common cancer in the UK. In the last few decades, there has been a steady increase in incidence within developed countries, with the UK now seeing around 35,000 cases a year. Mortality increases with stage, and collectively, colorectal cancer is responsible for 10% of all cancer deaths in the UK [1, 2]. Definitive treatment involves surgical resection, aided by perioperative chemotherapy [3]. Identification of patients who will benefit from adjuvant chemotherapy remains a dilemma, particularly in stage II disease [4].

Minimal residual disease (MRD) is defined as microscopic neoplastic material that remains after curative treatment but is not detectable clinically [5], and thus holds the potential to precipitate disease relapse. Recently, there has been much interest in the use of circulating tumour DNA (ctDNA) for detection of MRD and prognostication following curative treatment, including surgical resection and radical chemoradiotherapy.

This ctDNA is released from dying cancer cells and is found in varying proportions amongst cell-free DNA (cfDNA) released following the death of normal circulating blood cells. It is released during apoptosis and necrosis and has a half-life of around 2 h [6, 7]. The concept of utilising circulating tumour-derived material to provide diagnostic information on cancer has been termed ‘liquid biopsy’ [8]. The liquid biopsy has many potential advantages over the traditional surgical biopsy. It is minimally invasive and amenable to repeat measurements over time. Liquid biopsies could overcome the spatial limitations of tissue biopsies, given the variation in genetic profiles seen within the tumour itself and between metastases [9, 10], and could theoretically provide a more complete picture of the molecular profile.

Despite the promise ctDNA holds, there are still a number of limitations. ctDNA comprises only a minor proportion of total cfDNA, so sensitive methods are required for detection [11]. Clonal haematopoiesis of indeterminate potential (CHIP) refers to non-tumour-derived somatic mutations in haematopoietic cells, which can produce false-positive results [8]. There are two main approaches to ctDNA analysis. Initially, measurement relied on PCR-based techniques targeting a few loci. This focused approach is quick and relatively inexpensive. The ability to detect very low variant allele frequencies (VAFs) brings high sensitivity, with digital PCR and BEAMing techniques able to detect VAFs as low as 0.01% [12]. However, PCR-based techniques rely on prior knowledge of the genetic profile of the cancer and have limited capabilities for multiplexing [7]. More recently, the development of next-generation sequencing (NGS) has enabled analysis of a much wider panel of target genes and screening for unknown variants [7, 13]. There is growing interest in the characteristics of ctDNA beyond somatic mutations, including methylation and fragmentation patterns [14].

At present there remains an urgent clinical need for a better post-operative risk stratification paradigm in colorectal cancer, with current tumour markers lacking sensitivity and rising late following disease recurrence [15, 16]. It has been acknowledged that ctDNA holds great potential for this application, as evidenced in a number of other primary cancer sites including pancreatic [17], lung [18] and breast [19] cancer, yet there remains little consensus on the validity of this approach in colorectal cancer, compounded by a lack of systematic evidence. This systematic review examines the utility of post-surgical ctDNA for detecting MRD following curative surgery in colorectal cancer, and compares study methodologies to facilitate recommendations for optimal study design for future research and integration into clinical practice.

Methods

Search strategy and study selection

An electronic search of MEDLINE, EMBASE and the Cochrane Library was conducted in July 2021. There was no restriction by language and no limits were applied to the search. The search strategy is available in Supplementary Material. The protocol was registered on PROSPERO (CRD42021261569). Study selection, data extraction and quality assessment were performed in duplicate with two authors (LF and LH) working independently. Disagreements were resolved by discussion between authors. All abstracts identified by the search strategy were screened and potentially eligible manuscripts were then reviewed. Study authors were contacted where relevant outcome data was missing from manuscripts.

To be included, studies had to meet the following prespecified criteria: (1) participants had to be diagnosed with colorectal cancer and undergoing curative surgical resection; (2) post-operative ctDNA measurement was performed; (3) participant follow-up had to be such that long-term outcomes could be assessed.

Surgical procedures on primary colorectal cancer, local recurrences and metastasectomies were included, provided they were carried out with curative intent. The post-operative ctDNA measurement could be carried out at any timepoint post-operatively, provided this measurement was then correlated with long-term outcomes. Any length of follow-up was considered, provided time to relapse or death was measured during this time. Studies were excluded if the manuscript could not be obtained from the British Library or was not available in English. Unpublished work was not included. We accepted any study design; however, case reports and reviews were not included. There was no restriction by publication date or sample size.

Data extraction

Data extraction was conducted in accordance with the following criteria: study characteristics (author, date of publication, country); study design (sample size, prospective/retrospective, follow-up time); participant baseline characteristics (age, gender, site, stage, neoadjuvant/adjuvant chemotherapy); ctDNA methodology (timing of samples, assay, gene panel, limit of detection, cut-off value).

At present there is no gold-standard method of detection of MRD, so long-term outcomes were used as surrogate markers, with the hypothesis that those with undetected residual disease will have a higher propensity to relapse. The outcomes collected were the proportion of subjects classified as ctDNA-positive at the first liquid biopsy after surgery, the proportion of participants who relapsed in each group, median progression-free survival (PFS), median overall survival (OS) and the corresponding hazard ratios (HRs), confidence intervals and p values.

Quality assessment

A quality assessment form was designed by considering relevant aspects from each domain in the ROBINS-I risk of bias tool [20]. This generated a ten-point scale. The mapping of each question to the domains of bias according to the ROBINS-I tool is shown in Supplementary Table 1. For each criterion, studies could be graded as ‘low risk’, ‘high risk’ or ‘unsure’. Each study was then scored out of 11, with the final score incorporating study timeline (i.e. prospective/retrospective). We also collected information on centre number, sample size and statistical adjustment.

Both the data extraction form and quality assessment form were pre-piloted and can be found in the supplementary material.

Data synthesis

A meta-analysis was conducted combining the HRs for PFS of ctDNA-positive vs ctDNA-negative groups. HRs were pooled by inverse variance using the overall estimated HR and standard error of individual studies, taken either from data presented in the manuscripts or from a Cox proportional-hazards model fitted to individual participant data provided as a supplement or obtained directly from the study authors. Heterogeneity was quantified with the I² statistic and a random-effects model was used in the presence of significant heterogeneity (p < 0.05 or I² ≥ 50%). Subgroup analysis was performed according to disease extent (primary resection vs metastasectomy), adjuvant chemotherapy and assay type (NGS vs PCR), as pre-planned. Results were displayed in forest plots. Publication bias was assessed by inspecting a funnel plot for asymmetry.
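The pooling described above can be illustrated with a minimal sketch. It assumes the DerSimonian-Laird estimator for the between-study variance, which is the usual random-effects method in RevMan 5 but is not stated explicitly in this review. The function name and the three input studies are hypothetical and are not data from the included papers. The chi-squared p value for heterogeneity is omitted to keep the example within the standard library.

```python
import math


def pool_log_hazard_ratios(log_hrs, std_errs):
    """Inverse-variance pooling of log hazard ratios (illustrative sketch)."""
    k = len(log_hrs)

    # Fixed-effect weights: the inverse of each study's variance.
    w = [1.0 / se ** 2 for se in std_errs]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, log_hrs)) / sum_w

    # Cochran's Q and the I^2 statistic (percentage of variability
    # attributable to between-study heterogeneity rather than chance).
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, log_hrs))
    df = k - 1
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0

    # DerSimonian-Laird estimate of the between-study variance (assumed method).
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    # Random-effects weights add tau^2 to each within-study variance.
    w_re = [1.0 / (se ** 2 + tau2) for se in std_errs]
    pooled = sum(wi * yi for wi, yi in zip(w_re, log_hrs)) / sum(w_re)
    se_pooled = math.sqrt(1.0 / sum(w_re))

    return {
        "fixed_hr": math.exp(fixed),
        "random_hr": math.exp(pooled),
        "ci_95": (math.exp(pooled - 1.96 * se_pooled),
                  math.exp(pooled + 1.96 * se_pooled)),
        "q": q,
        "i2": i2,
        "tau2": tau2,
    }


# Hypothetical example: three studies reporting HRs of 2, 6 and 12.
result = pool_log_hazard_ratios(
    [math.log(2.0), math.log(6.0), math.log(12.0)],
    [0.3, 0.4, 0.5],
)
print(result)
```

Pooling is done on the log scale because the log HR is approximately normally distributed. The random-effects estimate would be reported when I² is 50% or greater, in line with the threshold used in this review.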

This review adheres to the Preferred Reporting Items for Systematic Reviews and Meta-analyses (PRISMA) guidelines [21] and the Meta-analysis of Observational Studies in Epidemiology (MOOSE) guidelines [22]. Statistical analysis was performed on Review Manager (RevMan) Version 5.4, The Cochrane Collaboration (2020).

Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.

Results

Search results

The search identified 3581 papers after removal of duplicates. Full-text screening was performed for 147 studies, of which 37 studies were included, involving 3002 patients (Fig. 1) [23–59]. Details of the key excluded studies can be found in Supplementary Table 2.

Fig. 1: PRISMA flow diagram.

Flow diagram describing the study selection process and number of studies at each stage according to the Preferred Reporting Items for Systematic Reviews and Meta-analyses (PRISMA) guidelines.

Included studies

Included studies incorporated all stages of colorectal cancer (I–IV), with six specific to rectal cancer. On average, 42.2% of patients had rectal cancer and 34.8% exhibited right-sided disease. Articles were published between 1993 and 2021 and were conducted in North America, Europe, Asia and Australasia. Surgical procedures included removal of the primary cancer, local recurrence and metastasectomy. Nine papers addressed metastasectomy alone (liver, peritoneal or lung), with a further two including metastasectomy sub-groups. The median age ranged from 55 to 73 and the proportion of male participants ranged from 33% to 90% (mean: 53.2%) (Table 1). Out of 37 papers, only 16 (43.2%) reported the proportion of patients who received neoadjuvant chemotherapy (range: 0–100%; mean: 43.6%), and 26 (70.3%) the proportion of patients receiving adjuvant chemotherapy (range: 0–100%; mean: 63.5%). The most common regimen was 5FU-based, either alone or in combination with oxaliplatin. The median follow-up time of the studies ranged from 11.7 months to 6.6 years (median 26.2 months). Post-operative monitoring protocols were described in 23 (62.2%) studies, consisting of physical examination, laboratory tumour markers (CEA, CA19.9) and radiology (Table 1).

Table 1 Study characteristics.

Timing of the first post-operative ctDNA measurement varied from the day of surgery to 13 months post-surgery. PCR-based methods were used in 19 (51%) studies and 15 (41%) used NGS, with 3 studies monitoring epigenetic changes. Fourteen (38%) reported a limit of detection (LoD) of the assay and 31 (86%) specified a cut-off level to establish ctDNA positivity. There was little consensus on the gene panel breadth, with the number of genes evaluated ranging from 1 to 1021. In 16 studies, the mutations evaluated in ctDNA were based on those previously identified in tissue (15) or plasma (1). Within these, the size of the gene panel evaluated in the tumour ranged from 4 genes to whole-genome sequencing (WGS). ctDNA was also measured pre-operatively in 32 (86%) of the studies (Table 2).

Table 2 Methodology.

Association of ctDNA with PFS

The proportion of participants classified as ctDNA-positive at the first liquid biopsy after surgery ranged from 0 to 90.9% (median 20%). In 3 studies, no patients had detectable ctDNA at the first liquid biopsy after surgery [23,24,25]. The proportion of patients who relapsed during follow-up was consistently higher in ctDNA-positive participants, concurrent with shorter median PFS (Table 3). Time-to-event analysis for PFS according to post-operative ctDNA was available for 21 studies including 2645 participants. This included outcomes calculated from data available in the supplementary material [26] and data sent by the study authors [27]. Multivariate analysis had been performed in 15 studies and OS was assessed in 12 (Supplementary Table 3). A shorter PFS associated with ctDNA positivity was consistently observed, with HRs varying between 1.36 and 39.9. This was statistically significant in 19 studies via univariate analysis and in all multivariate analyses (Table 3).

Table 3 Disease relapse.

Meta-analysis of PFS according to ctDNA

A meta-analysis confirmed the poor prognosis associated with post-operative ctDNA detection, which was statistically significant [HR 6.92, CI 4.49–10.64, p < 0.00001] (Fig. 2). This effect was also seen in subgroup analysis according to adjuvant chemotherapy use [adjuvant chemotherapy: HR 6.01, CI 2.96–12.21, p < 0.00001; no adjuvant chemotherapy: HR 10.3, CI 6.46–16.45, p < 0.00001], disease extent [primary resection: HR 7.93, CI 4.27–14.75, p < 0.00001; metastasectomy: HR 5.08, CI 2.85–9.05, p < 0.00001] and assay type [NGS: HR 8.87, CI 5.93–13, p < 0.00001; PCR: HR 5.37, CI 2.84–10.16, p < 0.00001] (Fig. 3). A meta-analysis was also performed where multivariate analysis was available [HR 5.73, CI 3.34–9.84, p < 0.00001] (Supplementary Fig. 1). Statistical testing demonstrated significant heterogeneity (p < 0.00001) with an I² value of 77%, hence a random-effects model was used. The funnel plot of effect size (HR) against standard error showed asymmetry suggestive of publication bias (Fig. 4).
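When a manuscript reports only an HR and its confidence interval, the standard error needed for inverse-variance pooling can be recovered on the log scale. The sketch below applies this to the pooled PFS estimate reported above. It assumes a 95% Wald-type interval that is symmetric around the log HR, and the function name is hypothetical.

```python
import math


def se_from_ci(lower, upper, z=1.96):
    """Standard error of a log HR recovered from its confidence limits.

    Assumes the interval is log(HR) +/- z * SE, i.e. symmetric on the log scale.
    """
    return (math.log(upper) - math.log(lower)) / (2.0 * z)


# Pooled PFS estimate reported in the text: HR 6.92, CI 4.49-10.64.
se = se_from_ci(4.49, 10.64)

# If the interval is symmetric on the log scale, the geometric mean of the
# limits should reproduce the reported point estimate.
implied_hr = math.exp((math.log(4.49) + math.log(10.64)) / 2.0)

print(f"SE of log HR: {se:.3f}")
print(f"HR implied by the interval: {implied_hr:.2f}")
```

The same calculation can serve as a check on extracted data: an interval whose geometric mean is far from the reported HR suggests a transcription error or an interval that was not computed on the log scale.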

Fig. 2: Forest plot showing meta-analysis for PFS according to post-operative ctDNA following surgery for colorectal cancer.

Data displayed as HR with 95% confidence intervals on a logarithmic scale. HR hazard ratio, PFS progression-free survival, SE standard error.

Fig. 3: Subgroup analysis.

Forest plot showing subgroup meta-analysis for PFS according to post-operative ctDNA, by disease extent, adjuvant chemotherapy and assay type: a resection of primary disease; b metastasectomy; c did not receive adjuvant chemotherapy; d received adjuvant chemotherapy; e NGS; f PCR. Data displayed as HR with 95% confidence intervals on a logarithmic scale. HR hazard ratio, NGS next-generation sequencing, PCR polymerase chain reaction, PFS progression-free survival.

Fig. 4: Funnel plot.

Funnel plot to show effect size against standard error for HR of PFS according to ctDNA status. HR hazard ratio, PFS progression-free survival, SE standard error.

Association of ctDNA with OS

Hazard ratios comparing overall survival were available in five papers [28,29,30,31,32]. An association of poor prognosis with post-operative ctDNA detection was also seen on meta-analysis when comparing overall survival [HR 3.64, CI 1.63–8.12, p = 0.002] (Supplementary Table 3 and Supplementary Fig. 2).

Quality assessment

The total quality assessment score of included studies ranged from 7 to 11 out of 11 (Supplementary Table 4). Patient baseline characteristics and ctDNA methodologies were generally well described. Most studies were conducted in single centres (n = 32) and sample size calculations were rarely performed (n = 4). A number of studies had small sample sizes; however, among those included in the meta-analysis the minimum sample size was 24, owing to the need for sufficient data for meaningful survival analysis.

Discussion

In this review, we demonstrate that ctDNA detection after curative surgery in colorectal cancer is associated with shorter time to disease relapse. This relationship was consistently demonstrated across multiple studies, and here we demonstrate for the first time that this effect is statistically significant when combined through a meta-analysis. The role of ctDNA as a marker of prognosis has previously been explored in stage IV disease; a systematic review included four studies looking at resectable disease, incorporating 123 patients. The authors reported a ‘lead time’ for detection of disease relapse by ctDNA compared with imaging, but did not find a significant relationship between pre-surgery ctDNA and overall survival [33]. As far as we are aware, this is the first meta-analysis combining survival analysis between ctDNA detection and long-term outcomes and the first review examining this effect in resectable disease across all disease stages. Despite the large volume of research on this topic, there remains a lack of consensus on a number of practical aspects. This resulted in considerable variability between studies, which introduced heterogeneity into the analysis and was the main limitation of this review.

Post-operative ctDNA measurement could influence clinical management at a number of points. Recognition of patients at low risk of relapse would enable identification of individuals in whom adjuvant therapy is unnecessary, whereas ctDNA measurement after completion of adjuvant treatment could be used to determine the need for further treatment [34, 35]. ctDNA could also be incorporated into ‘watch and wait’ protocols in rectal cancer following complete response to neoadjuvant chemotherapy. Liquid biopsy could also be incorporated into the assessment of response to other modalities of curative treatment, including radical radiotherapy. Additionally, ctDNA could be used to guide post-treatment surveillance through identification of patients in whom more intensive monitoring is warranted.

There was little consensus across studies regarding timing of ctDNA sampling. Three studies measured ctDNA both post-surgery and after completion of adjuvant chemotherapy, demonstrating the post-chemotherapy time-point to be a stronger predictor of prognosis [31, 36, 37]. In order to be of clinical utility, detection of MRD should be performed at a time when it is possible to influence disease management. Delay in commencing adjuvant chemotherapy beyond eight weeks is associated with worse long-term outcomes [38], meaning that post-surgical ctDNA timings will be a critical consideration when being incorporated into treatment pathways. Analysis should be performed once ctDNA from the primary tumour has been cleared from the circulation. Clearance of ctDNA following surgery was investigated by Chen et al. through serial measurement in the immediate post-operative period following resection of lung cancer; they showed that ctDNA continues to decrease until three days post-surgery and that detection past this time point correlated better with prognosis [39]. Another important consideration in assay timing is that cfDNA rises with physiological stresses, including surgery. Henriksen et al. recently investigated the sequence of cfDNA and ctDNA post-operatively in colorectal and bladder cancer; they found that short cfDNA rose and remained significantly elevated for four weeks following surgery and recommend repeat ctDNA analysis at four weeks for any patients in whom ctDNA is not detected immediately post-op [40].

Gene panel selection remains a challenge in many aspects of precision oncology. There was a wide variation in the breadth of gene panels in this review as a result of the combination of PCR and NGS-based techniques. More comprehensive gene/mutation panels will enable detection of rarer mutations [41], but bring the possibility of false positives from CHIP [8]. Some of the studies in this review investigated presence of germline mutations either by sequencing DNA from peripheral blood leucocytes or based on the ctDNA VAF.

A tumour-informed approach was adopted by 16 studies, tracking previously identified mutations. This personalised approach brings the advantage of improved specificity whilst also achieving a high sensitivity using PCR-based assays [42]. However, the need for individualised assay development will be more logistically difficult to incorporate into routine care.

An alternative approach to identifying somatic mutations is to assess epigenetic changes. Although technically more challenging to measure, methylation changes are more consistent across a cancer type and occur early in the cancer pathophysiology. Four papers in this review assessed gene methylation [29, 30, 43, 44]. Parikh et al. investigated both genetic and epigenetic changes in NGS analysis of 103 patients undergoing curative surgery for stage I–IV colorectal cancer and concluded that integrating both genetic and epigenetic changes increases sensitivity for MRD detection [44].

Assay sensitivity is of particular significance in the setting of MRD, where disease bulk is low. Of our included studies, Suzuki et al. report the lowest LoD of 0.02% using ddPCR [27] (Table 2). In three studies, none of the cohort had detectable ctDNA after surgery [23,24,25] (Table 3), yet in all three studies a subset of patients went on to relapse, which may reflect ctDNA levels below the detection limit of these assays. The majority of studies in this review measured pre-surgical ctDNA. In three studies, detection of ctDNA pre-surgery was a requirement for inclusion in the post-operative analysis [27, 45, 46], which may serve to remove ‘non-shedders’ or ‘low shedders’, a subset of patients whose tumours release little or no ctDNA.
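The dependence of detection on the amount of input DNA can be illustrated with a simple sampling calculation. This is a sketch under strong assumptions that are not taken from the included studies: a single tracked locus, a binomial sampling model, an assay with no technical error, and hypothetical numbers of genome equivalents per sample. Only the VAF of 0.02% corresponds to a figure quoted above.

```python
import math


def detection_probability(vaf, genome_equivalents, min_mutant_molecules=1):
    """Probability of sampling at least `min_mutant_molecules` mutant fragments.

    Each of the `genome_equivalents` sampled fragments at the locus is assumed
    to carry the mutation independently with probability `vaf`.
    """
    n = genome_equivalents
    p_below_threshold = sum(
        math.comb(n, k) * vaf ** k * (1.0 - vaf) ** (n - k)
        for k in range(min_mutant_molecules)
    )
    return 1.0 - p_below_threshold


# VAF of 0.02% with hypothetical inputs of 1000, 5000 and 10000 genome equivalents.
for n in (1000, 5000, 10000):
    p = detection_probability(0.0002, n)
    print(f"{n} genome equivalents: P(at least one mutant fragment) = {p:.3f}")
```

Under this model, a sample may contain no mutant fragment at all even when the assay chemistry could in principle detect it, which is one reason why tracking several mutations per patient can raise sensitivity for MRD.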

Statistical testing showed significant heterogeneity between studies, which is likely to affect the repeatability and external validity of this review. This remains the main limitation of this review and of application to clinical practice. Clinical heterogeneity will have arisen from differences in study design. Differences in the approach to removal of CHIP and requirement for ctDNA detection pre-operatively will have affected the pre-test probability of post-operative ctDNA detection. This review will also have been subject to methodological heterogeneity due to the range of assays used for ctDNA analysis. Subgroup analysis was performed to partially overcome this. There remained significant heterogeneity in subgroup analysis, probably as a result of the large number of contributing variables. Of note, statistical testing demonstrated no appreciable heterogeneity within the metastasectomy subgroup, confirming disease stage to be one of the sources of heterogeneity.

Many of the studies in this review were small and exploratory in nature. There was no minimum sample size for inclusion, resulting in the inclusion of a few studies with small numbers of patients. However, for inclusion in the meta-analysis there had to be sufficient participants for survival analysis to be performed. Quality assessment looked at the likelihood of bias due to differences in the management of ctDNA-positive and -negative groups. For a ‘low bias’ score the treating clinicians had to be blinded to the ctDNA results, which was the case in 15 studies. A further significant source of bias would be confounding due to the effects of adjuvant chemotherapy, with only 12 studies outlining the proportion of participants who received adjuvant chemotherapy. Overall, it was felt that bias due to the classification of interventions and measurement of outcomes was low.

Funnel plot asymmetry was observed, suggestive of publication bias. This is likely due to inclusion of a number of smaller studies and was partly overcome by obtaining individual participant data where possible to calculate HRs. Whilst this might exaggerate the magnitude of effect, the fact that the association was consistently observed across the studies suggests a true relationship. In addition, sample size calculations were performed in four of the included studies, demonstrating that shorter PFS associated with ctDNA detection reaches statistical significance when suitably powered [29, 31, 37, 47]. Large scale observational trials are already underway to establish the prognostic implications of ctDNA detection following surgery. Preliminary results from the GALAXY trial demonstrated a significantly shorter PFS with ctDNA detection at both 4 and 12 weeks post-op, and a higher rate of ctDNA clearance with adjuvant chemotherapy [48]. Interventional trials are also underway investigating the effectiveness of ctDNA in directing adjuvant chemotherapy use [49] and recent results from the DYNAMIC trial demonstrated non-inferiority with ctDNA guided selection to adjuvant chemotherapy [50].

A further limitation of this review was the inclusion of participants with incomplete surgical resections within some of the studies, which would preclude the analysis of MRD. Inclusion of studies that did not test for matched germline mutations may have resulted in false positives due to CHIP. Patients who had undergone curative treatment by other modalities such as chemoradiotherapy were not included, as this was outside the scope of this review.

Conclusions

To conclude, ctDNA detection after curative surgery for colorectal cancer is a marker of poor prognosis. Here we demonstrate for the first time via meta-analysis that post-operative ctDNA detection is associated with a significantly shorter PFS. Despite this wide body of evidence, there remains no consensus on many logistical aspects, most notably the timing and method of analysis. This lack of consensus resulted in the considerable heterogeneity of this review and remains the greatest limitation to the clinical utility of post-operative ctDNA measurement.