Symptoms and patient factors associated with longer time to diagnosis for colorectal cancer: results from a prospective cohort study

Background: The objective of this study is to investigate symptoms, clinical factors and socio-demographic factors associated with colorectal cancer (CRC) diagnosis and time to diagnosis. Methods: Prospective cohort study of participants referred for suspicion of CRC in two English regions. Data were collected using a patient questionnaire, primary care and hospital records. Descriptive and regression analyses examined associations between symptoms and patient factors with total diagnostic interval (TDI), patient interval (PI), health system interval (HSI) and stage. Results: A total of 2667 (22%) participants responded; after exclusions, 2507 remained. Participants were diagnosed with CRC (6.1%, 56% late stage), other cancers (2.0%) or no cancer (91.9%). Half the cohort had a solitary first symptom (1332, 53.1%); multiple first symptoms were common. In this referred population, rectal bleeding was the only initial symptom more frequent among cancer than non-cancer cases (34.2% vs 23.9%, P=0.004). There was no evidence of differences in TDI, PI or HSI for those with cancer vs non-cancer diagnoses (median TDI CRC 124 vs non-cancer 138 days, P=0.142). First symptoms associated with shorter TDIs were rectal bleeding, change in bowel habit, 'feeling different' and fatigue/tiredness. Anxiety, depression and gastro-intestinal co-morbidities were associated with longer HSIs and TDIs. Symptom duration-dependent effects were found for rectal bleeding and change in bowel habit. Conclusions: Doctors and patients respond less promptly to some symptoms of CRC than others. Healthcare professionals should be vigilant to the possibility of CRC in patients with relevant symptoms and mental health or gastro-intestinal comorbidities.

intervals between the onset of cancer symptoms and the patient's presentation to healthcare, compounded by unequal access to optimal diagnostics and specialist care, leading to more late-stage diagnoses (Neal and Allgar, 2005; Maringe et al, 2013). There are also gender, age and region differences in time to diagnosis across the United Kingdom.
The diagnostic process comprises a series of stages or intervals, each contributing to the overall period of time between onset of symptom/s and initiation of treatment (Weller et al, 2012). The total diagnostic interval (TDI) comprises the patient interval (PI: from first symptom onset to first healthcare consultation) and the health system interval (HSI: from first consultation via referral and investigations to diagnosis and initiation of treatment) (Weller et al, 2012); shorter PIs and HSIs are associated with improved CRC prognoses (Torring et al, 2011; Neal et al, 2015). For CRC, recent evidence suggests that both intervals are longer than for other common cancers such as lung or urological cancer (Dregan et al, 2013; Keeble et al, 2014; Neal et al, 2014a; Lyratzopoulos et al, 2015b).
Colorectal cancer patients may be symptomatic for many months before presentation. They may experience multiple symptoms, both bowel-specific (rectal bleeding, change in bowel habit and abdominal pain) and systemic (loss of weight or appetite and fatigue) (Hamilton et al, 2005; Rasmussen et al, 2015). Not all individuals interpret their initial symptoms as serious and may attribute them to normal variation, ageing, lifestyle or other comorbidities (Macleod et al, 2009; Hall et al, 2015; McLachlan et al, 2015). International comparisons suggest that lower age-related risk and highest perceived barriers to symptomatic presentation were reported in the United Kingdom (Forbes et al, 2013); other potential influences include negative beliefs about cancer outcomes (Beeken et al, 2011; Lyratzopoulos et al, 2015a) and poor awareness of the risk of cancer (Simon et al, 2010; Quaife et al, 2014).
Once the decision to seek medical help has been made, further delays may occur. One third of CRC patients have three or more consultations with a general practitioner (GP) before referral, compared with only 17.9% for all cancers (Lyratzopoulos et al, 2013). Furthermore, the pathway to diagnosis in primary care may be complex, with delays arising when presentation is complicated by comorbid conditions, when referral is delayed or declined, or from false-negative investigation.
Symptoms of possible CRC much more commonly arise from benign or self-limiting conditions and GPs face a considerable challenge to evaluate patients while making efficient use of hospital-based resources (Hansen et al, 2015). Much of the evidence about the predictive value of symptoms for cancer or their association with later diagnosis comes from retrospective studies of people with CRC (Esteva et al, 2013) or from general practice data sets (Hamilton et al, 2005); these are limited by quality of data recording. Little is known about the diagnostic pathways of those whose symptoms transpire not to be caused by CRC or about which symptoms are associated with less timely diagnosis. We therefore recruited a prospective cohort of patients with symptoms suggestive of CRC at the point of their referral. We aimed to investigate the symptoms, clinical factors and socio-demographic factors associated with CRC diagnosis and time to diagnosis.

MATERIALS AND METHODS
Setting and governance. We recruited patients from four secondary care hospitals in the East (n = 1) and North East of England (n = 3) between December 2010 and March 2013. We gained all appropriate NHS ethics (reference: 10/H0306/50) and clinical governance approvals. The SYMPTOM colorectal study was conducted alongside the SYMPTOM lung and pancreas studies, collectively part of the NIHR-funded DISCOVERY programme of applied research. The methods of data collection and analysis have already been described with the reporting of the SYMPTOM lung study and hence will only be outlined briefly here.
Patient recruitment. All GP referral letters to urgent and routine colorectal and gastrointestinal clinics in participating hospitals were reviewed by a research nurse, to identify patients aged ≥40 years whose referral mentioned any one or more symptoms from a pre-specified list of symptoms known to be associated with CRC (Supplementary Material). These patients were sent an information sheet, the SYMPTOM colorectal questionnaire and a reply-paid envelope for return. Patients were not eligible for the study if they were already undergoing treatment for any cancer (excluding non-melanoma skin cancer) or had such serious mental and/or physical disease that they were not considered suitable to complete a questionnaire, based on review of the medical record by the research nurse. Patients referred as a result of participation in the Bowel Cancer Screening Programme were excluded. No reminder letters were sent to non-responders, as required by the research ethics committee.
Data collection. We based our data collection, analysis and reporting on the Aarhus statement for the conduct of cancer diagnostic studies and STROBE guidelines for reporting observational studies (von Elm et al, 2007;Weller et al, 2012).
Patient data. Patient data were collected using the SYMPTOM questionnaire, completed and returned to the research team within 3 months of being sent; most completed it within 2 weeks and before receiving their diagnosis. The SYMPTOM colorectal questionnaire was derived from the previously validated C-SIM questionnaire (Neal et al, 2014b). This had been developed for use among people recently diagnosed with cancer and hence we modified it for use before diagnosis, consulting widely with clinical and research colleagues, and patient representatives, to ensure it was worded sensitively, as cancer may not have been explicitly mentioned as a possibility by the referring GP. The questionnaire asked for the first symptom noticed by the participant and then sought the presence or absence of specific symptoms (Supplementary Material). The remaining sections contained items about other symptoms and patient factors, including demographic characteristics and comorbidities.
Primary care data. Participants' GPs completed a short proforma from the patient's clinical record, including dates of the first presentation with any symptom from the questionnaire within the previous 2 years, plus its duration, if recorded.
Hospital data. We extracted data from hospital medical records on the following: date and type of referral (urgent, routine, emergency and other); date of first specialist consultation; dates of investigations and their findings; date of diagnosis (histological and clinical); and dates of MDT meetings and their clinical decisions. Data extraction took place a minimum of 6 months after recruitment, to allow sufficient time for completion of investigation and initiation of treatment. We undertook double data abstraction of a 5% sample of selected hospital data (dates of referral, first appointment, diagnosis and stage) and confirmed an acceptable level of agreement (>80% for dates, >90% for diagnosis and stage, Cohen's κ > 0.75 for all), with any discrepancies generally minor.

Data handling
Clinical outcomes. We used the date on the first confirmatory histology report as the date of diagnosis when this was available for both cancer (ICD codes C18-C20) and non-malignant conditions (Weller et al, 2012). Otherwise, we used the first date of the clinical diagnosis in the hospital medical record. Participants were classified into three groups: those with primary CRC (CRC), other intra-abdominal cancer (OC) and no cancer (NC). The main analyses focused on CRC vs NC, with secondary analyses including all cancers (CRC plus OC) vs NC. Colorectal cancer staging was categorised using TNM status at diagnosis (Travis et al, 2012) and further categorised into early stage (stages I and II) and late stage (stages III and IV). Complex diagnoses, or cases with incomplete data, were agreed by an adjudication group of clinicians associated with the study (FW, JE, GR and MR).
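The stage grouping described above can be illustrated with a minimal sketch. The function name `stage_group`, the handling of sub-stage letters and the "unstaged" label are assumptions made for illustration only; they are not taken from the study's data dictionary.

```python
def stage_group(tnm_stage: str) -> str:
    """Collapse an overall TNM stage (I-IV) into early or late stage.

    Hypothetical helper: stages I and II map to 'early', III and IV to
    'late'; anything else (missing or unrecognised) maps to 'unstaged'.
    """
    # Normalise case and drop a trailing sub-stage letter such as 'IIIB' -> 'III'.
    stage = tnm_stage.strip().upper().rstrip("ABC")
    if stage in {"I", "II"}:
        return "early"
    if stage in {"III", "IV"}:
        return "late"
    return "unstaged"
```

For example, `stage_group("IIA")` returns `"early"` and `stage_group("IV")` returns `"late"`.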
Demographic and clinical variables. Demographic details included the following: gender; age (treated as a continuous variable); ethnicity (coded as White vs non-White); smoking status; educational status; occupational status; living alone; and postal code, used to assign groups to quintiles of the Index of Multiple Deprivation (1 'least deprived' to 5 'most deprived'). Clinical variables relating to comorbidities included the following: gastrointestinal disease (inflammatory bowel disease, irritable bowel syndrome and peptic ulcer); respiratory disease (chronic obstructive pulmonary disease/asthma/other lung disease); anxiety/depression; heart disease; diabetes; and arthritis. Family history of cancer was also identified (present vs absent).
Symptoms. All symptoms reported up to 2 years before diagnosis were entered into the analysis (symptoms of longer than 2 years duration were unlikely to be associated with the subsequent diagnosis) (Hamilton et al, 2005;Biswas et al, 2015). Where participants provided estimated dates, these were converted to exact dates by adapting an algorithm used in the C-SIM trial (Neal et al, 2014b;Walter et al, 2015). We used exact dates where available and estimated dates otherwise. Furthermore, if a participant's unprompted, or free text, first symptom matched their response to a question about specific symptoms, they were given the corresponding symptom code and the earlier date. A first symptom was thus identified for each participant. Many participants reported more than one first symptom, termed 'multiple first symptoms'. 'Subsequent symptoms' were defined as any symptom occurring after a first symptom and before diagnosis.
The TDI. The TDI, or 'time to diagnosis', was defined as the time from onset of the first symptom/s to the date of diagnosis. Where the date of first presentation to healthcare was known, the PI, defined as the time from first symptom/s to the first presentation, and the HSI, defined as the time from first presentation to diagnosis, were also calculated.
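The interval definitions above can be sketched in a few lines. This is an illustrative sketch only: the names `Intervals` and `diagnostic_intervals` are hypothetical, and the dates in the usage note are invented rather than taken from the study data.

```python
from datetime import date
from typing import NamedTuple, Optional


class Intervals(NamedTuple):
    tdi: int            # total diagnostic interval, in days
    pi: Optional[int]   # patient interval, in days (None if presentation date unknown)
    hsi: Optional[int]  # health system interval, in days (None if presentation date unknown)


def diagnostic_intervals(first_symptom: date,
                         diagnosis: date,
                         first_presentation: Optional[date] = None) -> Intervals:
    """Compute TDI, and PI and HSI where the first presentation date is known."""
    tdi = (diagnosis - first_symptom).days
    if first_presentation is None:
        # Only the TDI can be calculated without a presentation date.
        return Intervals(tdi, None, None)
    pi = (first_presentation - first_symptom).days
    hsi = (diagnosis - first_presentation).days
    return Intervals(tdi, pi, hsi)
```

With a first symptom on 1 January 2012, first presentation on 10 February 2012 and diagnosis on 4 May 2012, `diagnostic_intervals(date(2012, 1, 1), date(2012, 5, 4), date(2012, 2, 10))` gives a TDI of 124 days, a PI of 40 days and an HSI of 84 days. For an individual participant the PI and HSI sum to the TDI, although the same need not hold for group medians.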
Analysis. Descriptive analyses were performed on all demographic, clinical and symptom data for the group as a whole and by diagnostic group (CRC, all cancer (CRC plus OC) and NC). The CRC group was also sub-analysed by stage at diagnosis. The TDI and PI for each group were compared using the Wilcoxon rank-sum test.
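As an illustration of the rank-sum comparison mentioned above, the following is a minimal standard-library sketch of the two-sided Wilcoxon rank-sum test using the normal approximation with a tie correction and no continuity correction. It is not the authors' code, the function name `rank_sum_test` is hypothetical, and the approximation is only reasonable for moderately large groups.

```python
import math
from typing import Sequence, Tuple


def rank_sum_test(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Two-sided Wilcoxon rank-sum test (normal approximation).

    Returns (z, p), where z is based on the rank sum of the first sample.
    """
    n1, n2 = len(x), len(y)
    n = n1 + n2
    # Pool the observations, remembering which sample each came from.
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    rank_sum_x = 0.0
    tie_term = 0.0
    i = 0
    while i < n:
        # Find the block of tied values starting at position i.
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        t = j - i
        midrank = (i + 1 + j) / 2  # average of ranks i+1 .. j
        rank_sum_x += midrank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        tie_term += t ** 3 - t
        i = j
    mean_w = n1 * (n + 1) / 2
    var_w = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    z = (rank_sum_x - mean_w) / math.sqrt(var_w)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided p-value
    return z, p
```

Calling `rank_sum_test(tdi_crc, tdi_nc)` on two lists of intervals in days (hypothetical variable names) returns a negative z when the first group tends to have shorter intervals.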
Clinically relevant demographics, comorbidities, first symptom/s and family history of cancer variables were a priori included in multivariate analyses, to identify predictors of time to diagnosis. Variables present in fewer than 10 participants were excluded. We chose to analyse CRC, OC and NC groups together for all outcomes, because at the time of presentation and referral the final diagnosis was unknown. Flexible parametric survival models were used to model the TDI, the PI and the HSI. In these cases, rather than death, the event considered to be 'failure' in the survival model is either first presentation to healthcare or diagnosis as appropriate. We preferred flexible parametric survival models (Lambert and Royston, 2009) over the Cox model, as they allow the investigation of duration-varying effects. We present results from two models: first, with only time-constant effects and, second, one that allows for duration-varying effects for symptoms. The former model provides average hazard ratios over time for all variables similar to a Cox model and makes it easier to compare the effects of different factors. The second model allows us to examine in more detail how their effect varies with duration. Symptoms with duration-varying effects were identified using a forward selection approach described in Royston and Lambert (2011) and using a significance level of P ≤ 0.01. All models used splines with five degrees of freedom for the baseline hazard; the duration-dependent effects were modelled using two degrees of freedom for the TDI and HSIs, and three degrees of freedom for the PIs.
Sample size. The expected total number of cases in the two areas was 265 annually. Assuming half of these could be identified prospectively from fast-track and routine referrals, including via colonoscopy (Barrett and Hamilton, 2008), we needed to recruit for 2 years to achieve 200 participants with CRC. We estimated that one patient in 25 would have CRC (6-11% in 2-week wait clinics (Rai and Kelly, 2007), but lower outside these clinics). With 200 participants with CRC and at least 10 times as many without, we would have over 80% power to detect an OR > 1.52 for a common symptom occurring in half of participants. With the same 2200 patients, all of whom would have a TDI, we would expect to have 80% power to detect an HR > 1.23 for a symptom occurring in 10% of patients.
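The recruitment arithmetic behind these figures can be checked with a short sketch. It reproduces only the counting steps stated above (not the power calculations), and all variable names are illustrative.

```python
# Figures quoted in the sample size paragraph.
annual_cases = 265            # expected CRC cases per year across the study areas
fraction_identifiable = 0.5   # share assumed identifiable prospectively
years = 2
target_crc = 200              # target number of participants with CRC

# Expected identifiable CRC cases over the recruitment period.
expected_crc = annual_cases * fraction_identifiable * years

# Minimum cohort if there are at least 10 non-CRC participants per CRC case.
minimum_total = target_crc + 10 * target_crc

# At an estimated prevalence of 1 in 25, each CRC case comes with 24 non-CRC referrals.
non_crc_per_crc = 25 - 1

print(expected_crc, minimum_total, non_crc_per_crc)
```

The expected 265 identifiable cases exceed the target of 200, the minimum cohort of 2200 matches the figure used for the hazard ratio calculation, and a prevalence of 1 in 25 comfortably satisfies the requirement of at least 10 non-CRC participants per CRC case.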

RESULTS
Patients (12 236) were approached and 2667 responded, an overall 21.8% response rate (East 26.3% and North East 18.6%). Of these, 70 did not meet the eligibility criteria and a further 90 had insufficient data for analysis, leaving a final cohort of 2507 participants. The demographics of the responders were similar to those of non-responders (responders 47% male, median age 65 years; non-responders 45% male, median age 65 years). The stage distribution for those of our cohort who had CRC was comparable to 2013 data for England (late stage 50% vs early 40% and 10% unknown) (National Cancer Intelligence Network (NCIN)).
The characteristics of the whole cohort are provided in Tables 1 and 2. Participants were diagnosed with CRC (152, 6.1%; 56% late stage), other cancers (50, 2.0%) or no cancer (2305, 91.9%). The majority of those with CRC had late-stage disease (n = 85, 55.9%; early stage: n = 65, 42.8%); two CRCs (1.3%) were unstaged. The proportion of males was higher in the group diagnosed with CRC compared with the group diagnosed with NC (57.2% vs 46.2%, P = 0.008). Those in the CRC group also had a higher median age (71 vs 65, P < 0.001) and were more likely to have been referred through an urgent pathway (90.1% vs 71.1%, P < 0.001). The diagnostic groups were otherwise similar in terms of deprivation, education and ethnicity (Table 2).
Among the total cohort, over half had a solitary first symptom (1332, 53.1% (CRC 92, 60.5%, NC 1218, 52.8%)). However, multiple first symptoms were common with 21.6% having two first symptoms, 9.3% having three and 8.4% with four or more multiple first symptoms. A few participants (7.6%) reported no symptoms first appearing within 2 years of diagnosis (most had symptom/s pre-dating the 2-year cut-off). Across the total cohort, change in bowel habit (62.8%) and 'bleeding from back passage' (37.8%) were the most common symptoms, but only 45% and 24.4% of those, respectively, were reported as first symptoms (Table 3). Symptoms not specifically enquired for in the questionnaire but volunteered, such as acute gastro-intestinal illness, perianal pain, flatulence, bloating and mucus discharge, were individually reported by <3% of participants. 'Bleeding from back passage' was significantly more frequently reported in people diagnosed with CRC than non-cancer as a first symptom (34.2% vs 23.9%, P = 0.004).
No other symptom as a first or subsequent symptom was more common in those with CRC than those without.
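As a rough check on the rectal bleeding comparison, the sketch below applies a Pearson chi-square test to a 2x2 table. The choice of test is an assumption (the text does not name the test used for proportions), the counts are reconstructed by rounding the reported percentages against the group sizes of 152 and 2305, and `chi_square_2x2` is a hypothetical helper.

```python
import math
from typing import Tuple


def chi_square_2x2(a: int, b: int, c: int, d: int) -> Tuple[float, float]:
    """Pearson chi-square (1 degree of freedom, no continuity correction).

    Table layout: [[a, b], [c, d]]. Returns (chi2, p).
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom the upper tail is erfc(sqrt(chi2 / 2)).
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p


# Reconstructed counts (assumption): 34.2% of 152 CRC and 23.9% of 2305 NC participants.
crc_total, nc_total = 152, 2305
crc_bleed = round(0.342 * crc_total)
nc_bleed = round(0.239 * nc_total)

chi2, p = chi_square_2x2(crc_bleed, crc_total - crc_bleed,
                         nc_bleed, nc_total - nc_bleed)
print(round(chi2, 2), round(p, 3))
```

Under these assumptions the resulting P value is close to the reported P = 0.004, although the true denominators may differ if symptom data were missing for some participants.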
A TDI could be calculated for 2316 participants; the median TDI across the whole cohort was 136 days (IQR 74-255) (Table 4). There was no difference in the TDI between people diagnosed with CRC and NC (CRC 124 vs NC 138 days, P = 0.142), and no evidence of differences between PI and HSI among those with and without CRC (PI median 41 days vs 36 days, P = 0.606; HSI 49 days vs 59 days, P = 0.078).
Results from the time constant survival models are shown in Table 5 (Supplementary Table A1 for univariate analyses). Older age at diagnosis and a number of symptoms were all associated with a shorter TDI, while gastrointestinal comorbidity, depression/anxiety and family history of cancer were all associated with a longer TDI and HSI.
Symptom duration-dependent effects. We found symptom duration-dependent effects for 'bleeding from back passage' and change in bowel habit (Supplementary Figure 1). The time-constant effects from other variables in the same model are similar to those described in the previous section and are not detailed here. The symptom 'bleeding from back passage' was associated with briefer PI and HSI initially (HR > 1), but as the duration of this symptom lengthened the association weakened (HR became close to 1). In other words, when this symptom was present it was associated with faster action (presentation to healthcare professional and/or referral) in the initial period following symptom onset or presentation, compared with other symptoms, but if initial action was not taken the existence of the symptom had little impact on action at later times. In contrast, a change in bowel habit was associated with a lengthier PI initially (HR < 1) but if the symptom persisted beyond 10 days it became associated with a briefer PI (HR > 1). In other words, those whose initial symptom was a change in bowel habit were less likely to present early than those with other symptoms, but of those who did not present in the first 10 days, they presented faster on average than people with other symptoms (HR > 1). Change in bowel habit was always associated with a briefer HSI, with no time-dependent effect.
For early stage CRC, the median TDI for any first symptom was 157 days compared with 99 days for late stage CRC (P = 0.019) (Table 6).

DISCUSSION
This is the first study worldwide to recruit a large prospective cohort of patients with suspicious symptoms before diagnosis, to study factors associated with time to diagnosis for CRC. Among these patients, who had been referred for investigation, rectal bleeding was the only symptom more frequently seen with CRC, but it occurred in only a third of cases as a first symptom and less than two-thirds at any point before diagnosis. Other so-called 'alarm' symptoms such as change in bowel habit were also very common in this referred population but did not differentiate between CRC and those without cancer. The positive predictive value of these symptoms will be higher in this study population than in primary care, but the odds ratio between those with and without cancer will fall as this is an enriched, referred population. This does not mean these symptoms should not be taken seriously. Multiple first symptoms were common and symptoms often evolved over time before patients sought healthcare. There were different symptom and patient factors associated with the PI and HSI. Some less specific symptoms such as indigestion/abdominal pain or 'feeling different' were associated with shorter PIs; in contrast, only the very specific, 'classical', symptoms such as change in bowel habit and rectal bleeding were associated with shorter HSIs.
Our novel analysis of duration-dependent effects of symptoms showed that patient and healthcare provider responses to certain key symptoms of CRC, relative to other symptoms, differed according to how long the patient had experienced them. A short duration of rectal bleeding for some patients triggers an early consultation and prompt response by the healthcare system. Once rectal bleeding has been present for a longer time, it is no more likely than other symptoms to have a shorter diagnostic interval despite its recognition as a classical alarm symptom. This means that both patients and healthcare providers may normalise longer-term rectal bleeding and not consult or investigate promptly (Emery et al, 2013).
Comorbidities in the whole cohort were also associated with longer total diagnostic and healthcare intervals. Importantly, people with mental health problems, self-reported anxiety or depression, experienced a longer TDI and HSI, suggesting that their possible physical symptoms were not taken as seriously and were investigated later. Similarly, gastro-intestinal comorbidity increased time to diagnosis, probably due to healthcare providers attributing new or worsening symptoms to pre-existing illness (Emery et al, 2013;Hall et al, 2015).
There was no evidence that those with CRC were diagnosed more quickly than those with an alternative diagnosis. This is surprising given that a higher proportion of those with cancer were referred through the urgent pathway and that guidance on urgent referral for suspected CRC in England recommends that 'alarm' symptoms such as rectal bleeding and change in bowel habit warrant urgent referral for investigation (NICE, 2005). Symptoms other than rectal bleeding in this relatively large prospective cohort did not help differentiate CRC from other diagnoses.
Table footnote: Percentages add to >100%, because 54 patients have more than one diagnosis. More than one diagnosis in the same category (e.g., miscellaneous) was counted only once.
The median PI was 35 days and median HSI (combining primary and secondary care) was 58 days. This contrasts with findings of the secondary analysis of CRC data in the National Audit of Cancer Diagnosis in Primary Care, where data from primary care medical records showed a median PI of 19 days (Lyratzopoulos et al, 2015b). In a separate study using the same data set, 21% of CRC patients had three or more primary care consultations before referral, potentially contributing to longer HSIs (Lyratzopoulos et al, 2013). However, in an analysis of UK General Practice Research Database data for 2962 patients with CRC in 2007/8, the median HSI, of 80 days, was more comparable to our own findings (Neal et al, 2014a).
Key strengths of this study were the prospective design and the collection of data from several sources: patient reports and primary care and specialist records. The analytical and reporting approaches were robust and performed according to the methodological approaches and definitions recommended in the Aarhus statement (Weller et al, 2012) and STROBE guidelines (von Elm et al, 2007). We chose to define the date of first symptom/s using the patient-reported date rather than the primary care-reported date, as we were analysing patient-reported symptom/s. Ideally, a study would recruit patients from primary care before referral; however, this would be accompanied by major logistical and resource implications in identifying an extremely large prospective cohort of patients with colorectal symptoms, to capture sufficient patients with cancer. Instead, we recruited patients when they first encountered secondary care; this had the added benefit of allowing us to recruit patients presenting as emergencies and those referred from other specialists.
Recruitment involved two regions of England, selected to ensure a broad range of socio-economic, educational and occupational levels, and the deprivation data suggest that the cohort was reasonably representative of the national population.
The main study limitation is the overall recruitment rate of 22%, although this is similar to other recent studies (Kidney et al, 2015;Walter et al, 2015). We sought to make contact with the target population before they underwent investigation and received a diagnosis, and it is possible that some were unable or unwilling to complete a questionnaire at this worrying time. We are also likely to have under-recruited people who presented as an emergency or who died soon after presentation. However, a recent study from the English Cancer Patient Experience Study showed that only 6% of eligible patients died between sampling and mail-out, suggesting potential survival bias in these types of study is relatively small (Abel et al, 2016). The demographics of non-responders were very similar to participants and the proportion of late-stage CRC was comparable to national data, suggesting our cohort was reasonably representative, that selection bias was not a major issue, and that the findings can be generalised to similar symptomatic populations. If sicker patients were less likely to take part in the study and they were more likely to have shorter intervals, we may have overestimated the typical intervals in the population. We may well have also underestimated differences between those with different presenting symptoms. In common with a Danish prospective population-based study of diagnostic intervals, the problem of confounding by indication remains to some extent (Torring et al, 2012). Our results did not find that shorter time to diagnosis was associated with earlier stage disease. It may be that late-stage disease has different symptom profiles, which affect help-seeking and use of diagnostic pathways. Tumour factors (such as histological type, rate of growth and location) and host factors (such as comorbidity) can influence diagnostic intervals and result in apparently earlier diagnosis of later stage disease. 
As our study set out to investigate the diagnostic intervals from perception of first symptom to diagnosis, our primary exposure was the initial symptom or symptoms. Although this approach allows us to make robust comparisons with other findings, it may obscure the finer detail of symptom patterns and clusters as they evolve over time and their effects on timely help-seeking by patients and timely diagnosis in primary and secondary care. Although this was a large cohort, we had insufficient power to examine specific clusters of symptoms and their associations with outcomes; a much larger prospective study would be required to achieve this. The fact that fewer CRC cases were recruited than was the aim meant that the study was only powered to detect large differences in symptoms between those with and without CRC.
Table footnotes: (a) As these are median time intervals (rather than mean intervals), there is no expectation that the median TDI will equal the sum of median PI and median HSI. (b) The PI and HSI could only be calculated for 2103 participants (unknown presentation date n = 213). Those with an available date of presentation had a shorter median TDI (130 days vs 199 days for the remaining cases).
This study shows that there are subtle differences in the impact of symptoms and patient factors on patient and healthcare intervals, with some clear implications for policy makers and clinicians. Although rectal bleeding was the only symptom predictive of CRC in this referred population, it was only reported as the first symptom in one-third of cases and as a subsequent symptom in a further 25% of cases. Despite conducting such a large prospective cohort study, we failed to identify any other strong solitary symptom signals of CRC, suggesting that bowel cancer awareness campaigns, which currently concentrate on a single symptom, should also consider messages that reflect the importance of multiple symptoms and evolution of symptoms over time (Moffat et al, 2015). The recently revised NICE guidelines for early detection of CRC support this premise and have also lowered the threshold for referral, in line with patient preferences for investigation (Banks et al, 2014a, b). However, our study has also shown that only people presenting with shorter histories of rectal bleeding are investigated promptly, and that healthcare professionals should remain alert to symptoms of possible CRC in people with a history of gastro-intestinal or mental health conditions. The increasingly widespread use of clinical decision support in primary care can also be informed by our findings (Dikomitis et al, 2015; Green et al, 2015), but further research is needed, alongside GPs and specialists, to identify mechanisms by which patients can be identified, referred and diagnosed in the most timely and appropriate way.
Table footnote: [...], smoking status (current and ex-smoker vs never), living alone (yes vs no) and region (North East vs Cambridge). In this context, the HR represents the relative increase in rate of presentation/diagnosis. An HR of 2 would imply that patients in one group presented/were diagnosed twice as quickly as in the reference group.
In conclusion, as efforts to expedite the diagnosis of symptomatic CRC are likely to have benefits for patients in terms of improved survival, earlier-stage diagnosis and improved quality of life, it continues to be a priority to identify symptoms and other factors, which should prompt an individual to seek help or a GP to refer in an appropriate and timely manner. It is also important to develop other strategies for earlier diagnosis, including increasing uptake of CRC screening and perhaps the development of biomarkers to improve early detection. Nevertheless, these data provide support for more targeted evaluation of suspicious symptoms in an attempt to identify CRC at an earlier and more amenable stage. It may also be that targeted interventions at higher risk populations aimed at symptom monitoring could be more effective at recognising symptom evolution.