Introduction

Interval colorectal cancers are a small but clinically significant subset of colorectal cancers, defined as those diagnosed between a negative colonoscopic examination and the next recommended colonoscopy [1,2,3,4,5]. The reported frequency ranges from 3% to 9% [6,7,8,9,10,11,12,13,14,15,16,17], with an estimated pooled prevalence of 3.7% [16]. Understanding the mechanisms behind the development of interval colorectal cancer can guide strategies to enhance the efficacy of colonoscopy in preventing colorectal carcinoma.

Missed cancers or precursor adenomas, incompletely excised precursors, and a unique pathogenetic pathway that progresses rapidly to adenocarcinoma have all been postulated as possible explanations for the development of interval colorectal cancer [16,17,18,19,20]. Sporadic colorectal cancer arises through multiple pathogenetic pathways that result in cancers with distinct morphological and genetic profiles [21,22,23]. The majority arise through the conventional adenoma pathway and culminate in aneuploid, microsatellite stable (MSS) colorectal carcinoma. In contrast, the serrated polyp pathway accounts for ~15–20% of all colorectal cancers [22] and leads to tumors with high microsatellite instability (MSI-high) and CpG island methylation (CIMP-high) [21, 24, 25]. Prior studies have reported a higher prevalence of the MSI-H and CIMP-high phenotypes in interval compared to non-interval colorectal cancer [11, 26, 27]. This suggests that the serrated pathway may play a larger-than-expected role in the pathogenesis of interval colorectal cancer. However, interval colorectal cancers occur more frequently in the right colon [6,7,8,9,10,11,12,13,14,15,16,17], and the MSI-H/CIMP-H phenotype is also more common in that location [27,28,29], making tumor location a significant confounder when evaluating the role of the serrated pathway in the pathogenesis of interval colorectal cancer.

In this study, we compared the clinicopathologic and molecular profiles of interval and non-interval colorectal cancers matched for age, gender, and tumor location to determine the presence of any histologic or molecular signature specific for interval colorectal carcinoma.

Materials and methods

Clinicopathologic evaluation of interval colorectal cancer

The pathology archives at our institution were searched for all colorectal adenocarcinomas diagnosed between 2007 and 2012. Recurrent adenocarcinomas, metastatic carcinomas to the colon, and colorectal cancers arising in the setting of familial adenomatous polyposis, inflammatory bowel disease, Lynch syndrome, or any hamartomatous polyposis syndrome were excluded from further evaluation, as were those with no residual tumor in the resection specimen. The study was approved by our Institutional Review Board.

Interval colorectal cancer was defined in this study as cancer detected in a diagnostic examination prior to the next recommended colonoscopy and at least 1 year after the last colonoscopy. To further reduce misclassification of sporadic colorectal cancers as interval colorectal cancer, a sensitivity analysis defining interval colorectal cancer as cancer diagnosed <5 years but at least 1 year after the last colonoscopy was also performed.

Interval and non-interval colorectal cancers were then matched 1:1 on age, gender, and tumor location using a statistical software-based algorithm. Tumors up to the splenic flexure were classified as involving the right colon and those distal to it as involving the left colon. A predetermined set of clinicopathologic variables was recorded for each matched pair of colorectal cancers. These included smoking status and current aspirin or folate use from medical chart review; and tumor size and grade, pathologic T, N, and M stage, presence of mucinous, medullary, or signet-ring cell differentiation, lymphovascular invasion, and presence of any conventional or serrated precursor lesion adjacent to the carcinoma from review of the original H&E slides. Distinguishing a precursor lesion adjacent to cancer from colonization of the mucosa by the tumor can be challenging. A precursor lesion was recorded in our study only when a clear spectrum of low-grade adenoma evolving into high-grade dysplasia and cancer was seen lateral to the invasive tumor, or when a non-dysplastic serrated polyp was similarly present adjacent to the carcinoma. DNA mismatch repair (MMR) protein expression status by immunohistochemistry, including MLH1 (Novocastra, clone NCL-L-MLH1; 1:75 dilution), MSH2 (Calbiochem-EMD Millipore, clone NA27; 1:200 dilution), MSH6 (BD Bioscience, clone PU29; 1:50 dilution), and PMS2 (Cell Marque, clone MRQ-28; 1:100 dilution), was also evaluated in all interval and matched non-interval colorectal cancers.
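The matching step can be illustrated with a minimal Python sketch. The software and algorithm actually used are not specified here, so the greedy exact-matching strategy below, the function name match_one_to_one, the field names, and the example patients are all hypothetical; only the matching variables (10-year age group, gender, and tumor location) come from the study design.

```python
from collections import defaultdict

def match_one_to_one(cases, controls):
    """Greedy exact 1:1 matching on (10-year age group, gender, tumor side).

    Illustrative assumption only: the study's actual matching algorithm
    is not described beyond the matching variables.
    """
    def key(patient):
        # Integer division by 10 places ages 70-79 in the same age group.
        return (patient["age"] // 10, patient["gender"], patient["side"])

    # Index the available non-interval cancers by matching stratum.
    pool = defaultdict(list)
    for control in controls:
        pool[key(control)].append(control)

    pairs = []
    for case in cases:
        candidates = pool[key(case)]
        if candidates:
            # Each control is used at most once (matching without replacement).
            pairs.append((case["id"], candidates.pop(0)["id"]))
    return pairs

# Hypothetical patients, for illustration only.
cases = [
    {"id": "I1", "age": 74, "gender": "F", "side": "right"},
    {"id": "I2", "age": 66, "gender": "M", "side": "left"},
]
controls = [
    {"id": "N1", "age": 61, "gender": "M", "side": "left"},
    {"id": "N2", "age": 78, "gender": "F", "side": "right"},
    {"id": "N3", "age": 72, "gender": "F", "side": "left"},
]
pairs = match_one_to_one(cases, controls)
print(pairs)
```

Control N3 shares the age group and gender of case I1 but not the tumor side, so it is left unmatched; exact matching on location is what removes tumor site as a confounder in the paired comparison.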

Assessment of molecular profile of interval colorectal cancer

The molecular genotype of interval and non-interval colorectal cancers matched 1:2 on age, gender, and tumor location was also determined in a subset of cases. DNA from archival paraffin blocks with at least 20% tumor cellularity was analyzed by a custom hybrid-capture next generation sequencing assay that interrogates the full coding sequences of 309 genes for mutations and copy number variations, as well as 113 selected introns across 35 genes for rearrangements [30, 31]. The complete list of genes is provided in Supplementary Table 1.

Targeted sequences were captured using a solution phase Agilent SureSelect hybrid capture kit (Agilent Technologies, Santa Clara, CA, USA), and massively parallel sequencing was performed on an Illumina HiSeq 2500 sequencer (Illumina, San Diego, CA, USA). Mutation calls were made using Mutect25 and GATK software26–28 (Broad Institute, Cambridge, MA, USA) and gene-level copy number alterations were assessed using VisCap Cancer (DFCI, Boston, MA, USA). The sequence reads were aligned and processed through a bioinformatics pipeline to identify single-nucleotide variations and small insertions–deletions. Any single nucleotide variant present at >0.1% in Exome Variant Server was filtered out, unless it was classified as pathogenic in the Catalogue of Somatic Mutations in Cancer (COSMIC; cancer.sanger.ac.uk). Single nucleotide variants and indels were manually reviewed for significant mutations including (i) loss of function mutations (splice site disruption, frameshift, nonsense) as well as hotspot missense mutations for tumor suppressor genes [32]; (ii) missense hotspot mutations for oncogenes [32]; and (iii) pathogenic gene mutations listed in COSMIC.
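The population-frequency filter described above can be written as a short rule. The following Python sketch is illustrative and is not the pipeline used in the study: the function name keep_variant, the field names, and the example variants with their frequencies are hypothetical, while the 0.1% threshold and the COSMIC exception are taken from the text.

```python
def keep_variant(population_af, cosmic_pathogenic, max_af=0.001):
    """Return True if a single nucleotide variant survives the filter.

    population_af: allele frequency in the population database, as a
        fraction (0.001 corresponds to the 0.1% threshold in the text).
    cosmic_pathogenic: True if the variant is classified as pathogenic in
        COSMIC, in which case it is retained regardless of frequency.
    """
    return cosmic_pathogenic or population_af <= max_af

# Hypothetical variants, for illustration only.
variants = [
    {"name": "variant_A", "population_af": 0.0, "cosmic_pathogenic": False},
    {"name": "variant_B", "population_af": 0.02, "cosmic_pathogenic": False},
    {"name": "variant_C", "population_af": 0.02, "cosmic_pathogenic": True},
]
kept = [
    v["name"]
    for v in variants
    if keep_variant(v["population_af"], v["cosmic_pathogenic"])
]
print(kept)
```

Here variant_B is removed as a likely germline polymorphism, whereas variant_C has the same population frequency but is retained because of its pathogenic classification.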

Gene-level copy number variations were quantified as a ratio of fractional coverage of each exon in the tumor sample normalized against the fractional coverage of the corresponding exon in a panel of normal tissue controls. Circular binary segmentation was then used to assemble exons into contiguous multi-exon regions. The copy number data for each segment was then displayed visually and interpreted manually for copy number gains/losses [30, 31]. High copy number gain (“amplification”) was defined as 6 copies or above.
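The normalization described above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the VisCap Cancer implementation: the panel of normal controls is collapsed into a single depth profile, the segmentation step is omitted, and the exon names and read depths are hypothetical.

```python
def fractional_coverage(exon_depths):
    """Express each exon's read depth as a fraction of total depth in the sample."""
    total = sum(exon_depths.values())
    return {exon: depth / total for exon, depth in exon_depths.items()}

def copy_ratios(tumor_depths, normal_depths):
    """Ratio of tumor to normal fractional coverage for each exon.

    normal_depths stands in for the panel of normal controls, assumed here
    to be summarized as one depth value per exon.
    """
    tumor = fractional_coverage(tumor_depths)
    normal = fractional_coverage(normal_depths)
    return {exon: tumor[exon] / normal[exon] for exon in tumor}

# Hypothetical read depths for four exons, for illustration only.
normal_depths = {"exon_A": 100, "exon_B": 100, "exon_C": 100, "exon_D": 100}
tumor_depths = {"exon_A": 100, "exon_B": 100, "exon_C": 100, "exon_D": 300}

ratios = copy_ratios(tumor_depths, normal_depths)
for exon, ratio in ratios.items():
    print(exon, round(ratio, 2))
```

Because coverage is expressed as a fraction of the sample total, a gain in one exon lowers the ratios of the unchanged exons slightly below 1. This is one reason why segment-level review, rather than a fixed per-exon cutoff, is useful when interpreting gains and losses.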

Statistical analyses

Comparison of categorical clinicopathologic variables was done using McNemar’s test and conditional logistic regression. Unmatched comparisons between interval and non-interval colorectal cancers were performed using the χ² test and Fisher’s exact test, and the Wilcoxon signed rank test was used to compare tumor size. Frequencies of genetic alterations were compared using conditional logistic regression with correction for multiple comparisons when appropriate. All analyses were done using Stata/SE version 11 (StataCorp, College Station, TX).
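For readers unfamiliar with paired categorical tests, the following Python sketch shows the exact (binomial) form of McNemar's test, which uses only the discordant pairs. It is an illustration rather than the Stata procedure used for the analysis; whether the exact or the χ² form was applied is not stated, and the discordant-pair counts below are hypothetical.

```python
from math import comb

def mcnemar_exact(b, c):
    """Two-sided exact McNemar p-value.

    b: pairs in which only the interval cancer has the feature.
    c: pairs in which only the matched non-interval cancer has the feature.
    Under the null hypothesis, b follows Binomial(b + c, 0.5).
    """
    n = b + c
    if n == 0:
        return 1.0
    k = min(b, c)
    # Probability of a split at least as uneven as the observed one, one tail.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)

# Hypothetical discordant-pair counts, for illustration only.
p_balanced = mcnemar_exact(8, 8)
p_unbalanced = mcnemar_exact(1, 9)
print(p_balanced, p_unbalanced)
```

When a feature is equally frequent in both matched groups, the discordant pairs are necessarily split evenly and the exact test returns p = 1.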

Results

Study group characteristics

A total of 1106 carcinomas involving the colorectum were diagnosed between 2007 and 2012 at our institution. Of these, 124 were excluded from further analysis [recurrent adenocarcinomas (n = 53); metastatic carcinomas (n = 18); carcinomas in patients with a history of familial adenomatous polyposis (n = 6); inflammatory bowel disease (n = 23); Lynch syndrome (n = 14); Cowden syndrome (n = 1); no residual tumor in resection specimen (n = 6); and tumor tissue exhausted in subsequent levels (n = 3)], leaving 982 primary colorectal cancers in the initial study group. Cases excluded due to lack of additional tumor tissue were all non-interval colorectal cancers based on our study definition.

Fifty-one of the 982 primary colorectal cancers (5.2%) in the initial study group met our criteria for diagnosis of interval colorectal cancer. Indications for diagnostic colonoscopy in interval colorectal cancer patients were bleeding (23), anemia (11), abdominal pain (9), change in bowel habits (3), and colonic mass seen on imaging in patients undergoing follow-up for a history of extra-colonic malignancies (3). The exact indication was uncertain in the remaining two patients, but the colorectal cancer was diagnosed at a colonoscopy performed prior to the recommended surveillance examination. When the definition of interval colorectal cancer was restricted to those diagnosed <5 years after the last colonoscopy, 4.0% (n = 39) of all colorectal cancers in the initial study group qualified as interval colorectal cancer. Overall, 82% (42/51) of all interval colorectal cancers in our study were diagnosed <5 years after the last colonoscopy. The mean age of the entire study population (n = 982) was 71 years (range: 18–91 years), 51% of the patients were male, and 63% of the tumors involved the left colon. Unmatched comparison between interval and non-interval colorectal cancers showed that interval cancers were more likely to occur in the right colon (55% vs. 36%; p = 0.02) and in patients older than 70 years (55% vs. 34%; p = 0.002) (Table 1). The 28 right colon interval colorectal cancers were located in the cecum (8), ascending colon (5), and transverse colon without (6) or with (2) involvement of the splenic flexure; tumor location was simply described as “right colon” in the remaining 7 resections. The 23 left colon tumors involved the descending colon (2), sigmoid (5), rectosigmoid (3), and rectum (12); the tumor site was recorded as “left colon” in the remaining case. The frequency of interval colorectal cancers by year during the study period (2007–2012) varied between 3.7% and 6.7%. This difference was not statistically significant (p = 0.81).
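The cohort counts reported above can be cross-checked with simple arithmetic. The Python sketch below only re-tallies numbers stated in the text; the variable names are illustrative.

```python
# Exclusions reported for the 1106 carcinomas diagnosed during the study period.
exclusions = {
    "recurrent adenocarcinoma": 53,
    "metastatic carcinoma": 18,
    "familial adenomatous polyposis": 6,
    "inflammatory bowel disease": 23,
    "Lynch syndrome": 14,
    "Cowden syndrome": 1,
    "no residual tumor in resection specimen": 6,
    "tumor tissue exhausted": 3,
}
total_diagnosed = 1106
total_excluded = sum(exclusions.values())
study_group = total_diagnosed - total_excluded

# Interval cancers as a percentage of the initial study group.
interval_cancers = 51
interval_percent = round(100 * interval_cancers / study_group, 1)

# Sites of the interval cancers, as listed in the text.
right_colon = {
    "cecum": 8,
    "ascending colon": 5,
    "transverse colon": 6,
    "transverse colon with splenic flexure": 2,
    "right colon, not further specified": 7,
}
left_colon = {
    "descending colon": 2,
    "sigmoid": 5,
    "rectosigmoid": 3,
    "rectum": 12,
    "left colon, not further specified": 1,
}

print(total_excluded, study_group, interval_percent)
print(sum(right_colon.values()), sum(left_colon.values()))
```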

Table 1 Patient demographics and tumor location in initial study cohort including interval and non-interval colorectal cancers (CRC) (n = 982)

Clinicopathologic profile of matched interval and non-interval colorectal cancer

When interval colorectal cancers (n = 51) were compared to non-interval cancers (n = 51) matched on 10-year age group, gender and tumor location, no significant difference was seen between the two groups in tumor size and grade, T stage, N stage, presence of mucinous, medullary or signet-ring cell differentiation, or presence of lymphovascular invasion and distant metastasis. There was also no difference between interval and matched non-interval colorectal cancers in the proportion of smokers or current users of aspirin or folate (Table 2).

Table 2 Patient and tumor characteristics of matched interval and non-interval colorectal cancers (CRC)

Similarly, no significant difference in proportion of tumors with aberrant MMR expression was observed between the two groups (20% each; p = 1.00) (Table 3).

Table 3 Characteristics of colorectal cancer (CRC) with loss of MMR expression by immunohistochemistry

MMR-deficient tumors (n = 20; 20%) were seen mostly in women (80%) in this matched cohort, with 90% involving the right colon (Table 3). Most (n = 16) of these cases showed loss of MLH1 and PMS2 nuclear staining, a pattern more often caused by sporadic hypermethylation of MLH1 than by a germline mutation. A minority (n = 4) showed loss of MSH2 and MSH6, a pattern that is almost always due to germline MSH2 mutations in the setting of Lynch syndrome. All slides from the surgical resection specimens were available for review in 70% (71/102) of the matched cohort. In 48 (68%) of the colorectal cancers with all slides available for review, the mucosa showed extensive surface ulceration precluding an accurate evaluation of tumor precursors. In the remaining cases, conventional adenomas were the most commonly identified precursor lesion in both groups (22% in interval; 26% in non-interval). Compared to non-interval cancers, a higher percentage of interval colorectal cancers appeared to have large (≥1.0 cm) precursor lesions (13% vs. 19%) rather than small (<1.0 cm) precursor lesions (23% vs. 9%) associated with the invasive tumor. There was no significant change in our study findings when the analysis was restricted to interval cancers diagnosed <5 years after the last colonoscopy (Supplementary Tables 2, 3, 4), or when the two interval colorectal cancers with an uncertain indication for colonoscopy were excluded from analysis.

Matched comparison of molecular alterations in interval and non-interval colorectal cancer

The genetic landscape was also evaluated in a subset of interval (n = 20) and non-interval (n = 40) colorectal cancers matched 1:2 on 10-year age group, gender, and tumor location.

Interval colorectal cancers most commonly showed mutations in APC (75%), TP53 (65%), KRAS (40%), and BRAF (15%). Recurrent copy number variations most frequently seen in interval colorectal cancers involved TP53 (52%), SMAD2 (52%), SMAD4 (52%), SOX9 (48%), and EGFR (18%) (Fig. 1). A median of 42 (range: 0–112) and 29 (range: 0–142) gene copy number variations were detected in interval and non-interval colorectal cancers, respectively. The median number of single nucleotide variants was also similar between interval (median: 11; IQR: 6) and non-interval cancers (median: 9; IQR: 6) (p = 0.17), as was the frequency of genetic alterations in the five most common microsatellite-stable pathways involved in colorectal cancers (Table 4). Interval and matched non-interval colorectal cancers were also remarkably similar when the analysis was restricted to significant pathogenic gene variants most commonly involved in colorectal cancer pathogenesis (Supplementary Table 5). No statistically significant difference between the two groups was detected at the individual gene level either.

Fig. 1 Distribution of single-nucleotide variants and copy number variations of genes involved in pathways most commonly involved in colorectal cancer

Discussion

Interval colorectal cancers, also described as post-colonoscopy cancers in the literature, are a small but clinically significant subset of colorectal cancer. Understanding the mechanisms that underlie the development of interval colorectal cancer can provide insights for improving the performance characteristics of colonoscopy in the future. Interval colorectal cancers can develop from missed lesions, from incompletely excised precursors, or from carcinomas that progress very rapidly through a unique pathogenetic pathway. The precise contribution of each of these mechanisms to the development of interval colorectal cancer remains uncertain. Evaluation of the pathologic and molecular features of interval and non-interval colorectal cancer can shed some light on this issue and show whether there is evidence for any unique molecular subset of colorectal cancer capable of rapid progression.

The prevalence of interval colorectal cancer in our study was 5.2% and was associated with older age (≥70 years) and proximal location, consistent with prior literature [17]. Overall, 20% of interval colorectal cancers were MMR deficient by immunohistochemistry, and this subset showed a female predominance, also consistent with previous studies [5,6,7,8,9,10,11,12,13,14,15, 28, 29]. Large precursor lesions were seen in association with carcinoma in more than half of the interval colorectal cancers in which the surface mucosa could be examined, suggesting that these tumors likely arose from missed or incompletely excised lesions rather than as de novo high-grade carcinomas. The MSI-high phenotype has been reported in the literature to be more prevalent in interval than in non-interval colorectal cancers [11, 27, 28, 33]. In our study, 20% of all interval cancers were MMR deficient, which is twice the frequency of around 10% that we see in clinical practice at our institution when colorectal carcinomas are routinely screened for Lynch syndrome by immunohistochemistry [34]. However, when matched for age, gender, and tumor location, no significant difference was seen between interval and non-interval colorectal cancers in MMR protein expression (Tables 3 and 4) or in mutations or copy number changes in MMR pathway genes. Moreover, matched interval and non-interval colorectal cancers showed a remarkably similar genetic profile involving mutations in APC, TP53, KRAS, BRAF, PIK3CA, and other genes involved in the microsatellite-stable pathway. This suggests that missed or incompletely excised neoplastic lesions are a significant contributor to interval cancers and that a distinct molecular pathway leading to rapid progression to cancer in these patients is highly unlikely.

Table 4 Genetic alterations in interval and non-interval tumors involving the five major pathogenetic pathways in colorectal cancer (CRC)

The prevalence of missed lesions on colonoscopy has been estimated to range from 2% to 13% [8, 33, 35,36,37,38,39], with higher miss rates being associated with older age and proximal location [33]. The latter finding is most likely related to the higher prevalence of flat precursor lesions in the right colon, which may be more difficult to detect and completely excise than their more polypoid and pedunculated counterparts in the left colon. Incomplete excision of precursor lesions occurs in 7–31% of polypectomies [40], with higher rates associated with larger polyp size and sessile serrated adenoma/polyp histology [25, 40]. In a prospective study, we previously showed that incomplete polyp resection rates were 6.8% for polyps between 6–10 mm, 17.3% for polyps >10 mm, and 47% for sessile serrated polyps between 10–20 mm in size [40]. Some prior studies have reported statistically significant associations between interval colorectal cancers and MSI [11, 26, 27]. Sawhney et al. reported that interval cancers, after adjusting for age, were 3.7 times more likely to show MSI than non-interval cancers (95% CI: 1.5–9.1). In addition, interval cancers in the distal colon were estimated to be 17.5 times more likely to harbor MSI compared with non-interval cancers (95% CI: 1.81–170.21) [27]. In another study by the same group, Arain et al. reported the CIMP-high (odds ratio 2.41; 95% CI: 1.2–4.9) and MSI-H (odds ratio 2.7; 95% CI: 1.1–6.8) phenotypes to be independently associated with interval colorectal cancer, after adjusting for tumor location as a possible confounder in a multivariable model [26]. It is important to emphasize that 70% of all interval colorectal cancers in the study by Sawhney et al. were microsatellite stable, which again suggests that the majority of interval colorectal cancers arise through the conventional adenoma-carcinoma sequence and not through a unique rapid progression pathway. Moreover, 50% of tumors reported as MSI-H [27] were CIMP-negative in the subsequent study by the same group [26], which is surprising given the tight correlation between CpG island methylation and microsatellite instability. This discrepancy could be explained either by technical issues in the MSI and/or CIMP assays used in the study or by a high prevalence of Lynch syndrome patients in the study cohort. However, the mean age of patients in the interval cancer group was 75 years, and it is unlikely that a high percentage of these patients had undiagnosed Lynch syndrome. Interestingly, in their most recent study, the same group analyzed BRAF mutations in the same patient cohort and found no significant difference in mutation frequency between the interval and non-interval cancer groups, despite the known strong correlation between BRAF mutation and the MSI-H and CIMP-H phenotype [33].

Interval colorectal cancers have been defined with durations ranging from <1 year to over 10 years after a negative colonoscopy in different studies [2, 18, 19] and are, therefore, likely to be a heterogeneous group depending on the study definition and population characteristics. The possible mechanisms postulated above for the pathogenesis of interval cancers are not mutually exclusive [18, 19]. Robertson et al. [20] estimated that 52% of interval cancers were likely due to missed lesions, 19% due to incompletely excised lesions, and 24% due to possible new lesions. In our study, 82% of interval cancers developed within 5 years of the last colonoscopy, with 33% being early stage pT1 or pT2 tumors that could conceivably represent new lesions, although the possibility of origin from missed or incompletely excised precursors cannot be ruled out with certainty. A recent study on post-colonoscopy cancers estimated that interval colorectal cancers occur at the same site as a prior adenoma in 40% of cases and at a different site in the remaining 60% [41]. This further supports incomplete resection as a significant contributor to the pathogenesis of interval colorectal cancer. Importantly, these authors also showed an increasing trend in the development of interval colorectal cancers, which comprised 5.7% of all colorectal cancers in their study in 2005 but rose to 13.6% in 2016. It has also been shown recently that nearly half of all post-colonoscopy cancers have potentially modifiable factors, which should be addressed in order to increase the effectiveness of colonoscopy [42].

Our study has unique strengths as well as some limitations. This is a single institution experience from a tertiary academic center where the prevalence of interval colorectal cancers over the study period was similar to that in prior published literature. Instead of single gene testing, we used a multiplexed next generation sequencing assay to interrogate over 300 cancer-associated genes to see if any unique signature could be detected in interval colorectal cancers that might explain a rapid progression pathway. We pursued a study design in which the confounders of age, gender, and tumor location were matched between interval and non-interval cancers. However, if the association between MSI and interval colorectal cancer is truly present but small in magnitude, our study may be limited in power to detect it with statistical significance due to our relatively small sample size. In fact, a recent large population-based study of 10,365 incident colorectal cancers from Denmark did show a minor independent contribution by MMR deficiency (OR 1.26; 95% CI: 1.00–1.59) in post-colonoscopy cancers [17]. However, that study is quite different from ours in that colorectal cancers in the setting of Lynch syndrome and inflammatory bowel disease were also included, and the study population was from a time period when colonoscopies for average risk patients were not being performed in Denmark. Colonoscopy quality indicators, such as bowel preparation quality and withdrawal time, were not addressed in this study, but these data from our cohort have been published elsewhere [43]. The primary aim of our study was to determine whether serrated pathway tumors, or tumors with high-grade morphology and aggressive colorectal cancer subtypes such as signet ring cell, micropapillary, or mucinous adenocarcinomas, are over-represented among interval colorectal cancers, and whether the molecular signature of interval tumors reveals a unique pattern not seen in non-interval carcinomas.

In summary, comparison of the clinicopathologic and genetic landscape between interval and non-interval colorectal cancers matched on age, gender, and tumor location shows no significant difference in the prevalence of DNA MMR deficiency between the two groups. By next generation sequencing, the molecular signature of interval colorectal cancers is remarkably similar to that of non-interval carcinomas, suggesting that interval colorectal cancers arise primarily from lesions that were missed or incompletely excised at colonoscopy. These findings highlight the importance of improving colonoscopy performance characteristics in order to reduce the incidence of interval colorectal cancer, as has been suggested in other recent studies [42].