Symptoms and other factors associated with time to diagnosis and stage of lung cancer: a prospective cohort study

Background: This prospective cohort study aimed to identify symptom and patient factors that influence time to lung cancer diagnosis and stage at diagnosis. Methods: Data relating to symptoms were collected from patients upon referral with symptoms suspicious of lung cancer in two English regions; we also examined primary care and hospital records for diagnostic routes and diagnoses. Descriptive and regression analyses were used to investigate associations between symptoms and patient factors with diagnostic intervals and stage. Results: Among 963 participants, 15.9% were diagnosed with primary lung cancer, 5.9% with other thoracic malignancies and 78.2% with non-malignant conditions. Only half the cohort had an isolated first symptom (475, 49.3%); synchronous first symptoms were common. Haemoptysis, reported by 21.6% of cases, was the only initial symptom associated with cancer. Diagnostic intervals were shorter for cancer than non-cancer diagnoses (91 vs 124 days, P=0.037) and for late-stage than early-stage cancer (106 vs 168 days, P=0.02). Chest/shoulder pain was the only first symptom with a shorter diagnostic interval for cancer compared with non-cancer diagnoses (P=0.003). Conclusions: Haemoptysis is the strongest symptom predictor of lung cancer but occurs in only a fifth of patients. Programmes for expediting earlier diagnosis need to focus on multiple symptoms and their evolution.

Lung cancer is the most common cancer worldwide. Most cases are diagnosed in symptomatic patients; the majority have late-stage disease and a poor prognosis (Cancer Research UK, 2014a). In the United Kingdom, lung cancer is the most common cause of cancer mortality (Cancer Research UK, 2014b). Fewer than 10% of those diagnosed with lung cancer survive for 5 years, and UK lung cancer patients have poorer survival than those in other countries (Abdel-Rahman et al, 2009). This may partly be due to longer time between the onset of cancer symptoms and the patient's presentation to health care, leading to more late-stage diagnoses and therefore less eligibility for potentially curative treatment (Holmberg et al, 2010). Late-stage disease at diagnosis is associated with socioeconomic deprivation, especially among older men, and those with 20 or more pack-years of smoking, even if they stopped smoking within the previous 10 years (Lyratzopoulos et al, 2012). In England there are also regional differences that may reflect socioeconomic deprivation in northern compared with southern regions.
Lung cancer patients are often symptomatic for many months before presentation, irrespective of their disease stage at diagnosis (Corner et al, 2005). They commonly experience multiple symptoms, both lung-specific (cough, breathing changes, chest pain and haemoptysis) and systemic (loss of weight or appetite, fatigue) (Corner et al, 2005; Hamilton et al, 2005). Those most at risk may not interpret their initial symptoms as serious, or may attribute them to ageing, lifestyle, smoking habits or other comorbidities (Corner et al, 2005, 2006; Brindle et al, 2012). International comparisons suggest that UK differences in cancer awareness and beliefs may contribute to later presentation (Forbes et al, 2013), and there is early evidence that approaches to improve symptom awareness result in earlier-stage lung cancer diagnosis, as well as increased numbers of chest X-rays and total lung cancer diagnoses (Athey et al, 2012).
In primary care, general practitioners (GPs) face similar difficulties in evaluating new or evolving symptoms suspicious of lung cancer. One-third of lung cancer patients have three or more pre-referral consultations compared with only 3% of patients diagnosed with breast cancer (Lyratzopoulos et al, 2013). Furthermore, the pathway to diagnosis in primary care may be complex, and delays may occur with presentation complicated by comorbidity, false negative chest X-ray reports and delayed or declined referral (Mitchell et al, 2013; Rubin et al, 2014).
Much of the evidence about which symptoms best predict cancer or are associated with later diagnosis comes from retrospective studies in people with a lung cancer diagnosis or from general practice data sets, which are limited by issues of data recording. Little is known about the diagnostic pathways of those with similar symptoms who ultimately receive other malignant or non-malignant diagnoses. Even less is known about which symptoms result in prompt or less timely diagnosis. We therefore recruited a prospective cohort of patients in two English regions at the point of their referral for suspected lung cancer. We aimed to investigate the symptoms and other clinical and sociodemographic factors associated with lung cancer diagnosis, time to diagnosis and stage at diagnosis.

METHODS
Setting and governance. We recruited patients in the East and North East of England who were referred to four secondary care (East 1, North East 3) and one tertiary care (East) hospital between December 2010 and December 2012. We gained appropriate ethics (reference: 10/H0306/50) and clinical governance approvals. The SYMPTOM lung study was conducted alongside the SYMPTOM colorectal and pancreas studies, collectively part of the NIHR-funded DISCOVERY programme of applied research.
Patient recruitment. All referral letters to urgent and routine respiratory clinics across the five sites were reviewed by a research nurse. Patients aged ≥40 years with any symptoms suspicious of lung cancer were sent a study pack; this included an information sheet, the SYMPTOM lung questionnaire and a freepost envelope to return the completed questionnaire to the research team. Exclusion criteria included people already undergoing treatment for any cancer (excluding non-melanotic skin cancer) and those with serious mental and/or physical disease. Patients were only approached on a single occasion; no follow-up letters were sent.
Data collection. Our approaches to data collection, analysis and reporting were based on the recommendations of the Aarhus statement for the conduct of cancer diagnostic studies (Weller et al, 2012).
Patient data. The SYMPTOM lung questionnaire, drawing on the C-SIM questionnaire (Neal et al, 2014b), was modified for use among people before diagnosis. Due to the sensitive nature of the subject matter, we consulted widely among clinical and research colleagues and patient representatives to achieve appropriate wording. The questionnaire starts with the question 'What was the first thing or symptom you noticed that made you think something might be wrong?' followed by nine specific symptoms (coughing up blood; cough or worsening of a long-standing cough >3 weeks; breathlessness >3 weeks; chest/shoulder pain >3 weeks; hoarseness >3 weeks; plus decreased appetite, unexplained weight loss, fatigue/tiredness and feeling different 'in yourself'). Exact or estimated dates were requested for all symptoms. The remaining sections contained items about other symptoms, and demographic and clinical details.
Primary care data. GPs completed a proforma from their clinical records, providing dates of the first presentation with any symptom listed in the SYMPTOM questionnaire within the previous 2 years, plus its duration before presentation, if recorded.
Hospital data. Study researchers extracted data from hospital medical records, including date of referral and route (urgent, routine, emergency, other); date of first consultation; investigations and findings; and diagnosis and date (histological, clinical, MDT meetings). Proformas were completed up to 6 months after recruitment to allow sufficient time for completion of investigation and initiation of treatment. Across both geographical sites, double data abstraction of a 5% sample of hospital data (dates of referral, first appointment, diagnosis and stage) confirmed an acceptable level of agreement (>80% for dates; >90% for diagnosis and stage).

Data handling
Clinical outcomes. The date of diagnosis was based on the date on the pathology report where possible; the first date of clinical diagnosis in the medical record was used where pathology was unavailable. Participants were classified into three groups: those with primary lung cancer (LC), other cancer (OC) and no cancer (NC). The main analyses focused on LC vs NC, with secondary analyses including all cancers (LC plus OC) vs NC. Primary lung cancer staging was categorised using TNM status at diagnosis (Travis et al, 2011), and further categorised into early-stage (stages I and II) and late-stage (stages III and IV). Difficult or unusual diagnoses, or cases with incomplete data, were agreed by an expert clinical consensus group (FMW, JE, GR, RCR).
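The two classification rules described above (choice of diagnosis date, and grouping of stage into early and late) can be illustrated with a minimal sketch. The function names and the assumption that stage is supplied as a Roman numeral with an optional A/B/C sub-stage suffix are ours, for illustration only, and are not taken from the study's data dictionary.

```python
from datetime import date
from typing import Optional

# Grouping stated in the text: stages I and II are early, III and IV are late.
STAGE_GROUP = {"I": "early", "II": "early", "III": "late", "IV": "late"}


def stage_group(tnm_stage: str) -> str:
    """Collapse an overall stage such as 'IIIA' into 'early' or 'late'.

    Assumes (hypothetically) that sub-stages are written with a trailing
    A, B or C, which is stripped before the lookup.
    """
    roman = tnm_stage.strip().upper().rstrip("ABC")
    return STAGE_GROUP[roman]


def diagnosis_date(pathology: Optional[date],
                   first_clinical: Optional[date]) -> Optional[date]:
    """Prefer the pathology report date; otherwise use the first clinical date."""
    return pathology if pathology is not None else first_clinical
```

For example, `stage_group("IB")` gives `"early"`, and a participant with no pathology report is assigned the first clinical diagnosis date in the record.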
Demographic and clinical variables. Demographic details collected in the patient questionnaire included: gender; age (treated as a continuous variable); ethnicity (coded as white vs non-white); smoking status; educational status; occupational status; living alone; and postcode, used to derive national quintiles according to the Index of Multiple Deprivation (IMD) (1 'least deprived' to 5 'most deprived'). Clinical variables relating to comorbidities included respiratory disease (chronic obstructive pulmonary disease (COPD)/asthma/other lung disease), anxiety/depression, heart disease, diabetes and arthritis. A family history of cancer was also recorded (present vs absent).
Symptoms. Symptoms >2 years before diagnosis date were omitted from analysis, as we considered these unlikely to be associated with the developing disease (Ades et al, 2014); all other reported symptoms were included in analysis. Participants' estimated dates were converted by adapting an algorithm used in the C-SIM trial (Neal et al, 2014b). In brief, the mid-month date was used for 'a month'; mid-year for 'a year'; mid-April, mid-July, mid-October and mid-January for the seasons; and the actual dates for Christmas and Easter. We devised a second set of rules to allow for the combination of exact and estimated dates. If responses to the unprompted first symptom question matched a specific question, they were given the corresponding codes. The initial symptom was then identified for each participant. Many participants reported more than one initial symptom, termed 'synchronous first symptoms'. The initial and synchronous first symptoms were combined as 'first symptom/s'.
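The date-estimation rules summarised above might be sketched as follows. This is an illustration under stated assumptions rather than the C-SIM algorithm itself: taking 'mid-month' as day 15, 'mid-year' as 1 July, and winter as the January of the year supplied are our choices, and the function and argument names are hypothetical. Easter Sunday is computed with the standard anonymous Gregorian algorithm.

```python
from datetime import date
from typing import Optional

# Season midpoints named in the text (mid-April, mid-July, mid-October, mid-January).
SEASON_MONTH = {"spring": 4, "summer": 7, "autumn": 10, "winter": 1}
MID_DAY = 15  # assumption: 'mid-month' is taken as the 15th


def easter_sunday(year: int) -> date:
    """Gregorian Easter Sunday (anonymous / Meeus-Jones-Butcher algorithm)."""
    a, b, c = year % 19, year // 100, year % 100
    d, e = b // 4, b % 4
    f = (b + 8) // 25
    g = (b - f + 1) // 3
    h = (19 * a + b - d - g + 15) % 30
    i, k = c // 4, c % 4
    l = (32 + 2 * e + 2 * i - h - k) % 7
    m = (a + 11 * h + 22 * l) // 451
    n = h + l - 7 * m + 114
    return date(year, n // 31, n % 31 + 1)


def estimate_date(kind: str, year: int,
                  month: Optional[int] = None,
                  season: Optional[str] = None) -> date:
    """Convert an imprecise response into a single calendar date."""
    if kind == "month":
        return date(year, month, MID_DAY)
    if kind == "year":
        return date(year, 7, 1)  # assumption: 'mid-year' is 1 July
    if kind == "season":
        # assumption: 'winter' maps to mid-January of the year supplied
        return date(year, SEASON_MONTH[season], MID_DAY)
    if kind == "christmas":
        return date(year, 12, 25)
    if kind == "easter":
        return easter_sunday(year)
    raise ValueError(f"unknown kind: {kind}")
```

A response of 'March 2011' would therefore be coded as 15 March 2011, and 'Easter 2011' as 24 April 2011.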
The total diagnostic interval. The total diagnostic interval (TDI), or 'time to diagnosis', defined as the time from the first symptom/s to the date of diagnosis, was calculated for all participants.
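As a rough illustration of how first symptom/s and the TDI follow from the dated symptoms, the sketch below drops symptoms with onset more than 2 years before diagnosis, takes the earliest remaining onset date, and treats every symptom sharing that date as a synchronous first symptom. Approximating 2 years as 730 days, assuming all onset dates fall on or before the diagnosis date, and the function name are our assumptions; the example values are invented.

```python
from datetime import date, timedelta
from typing import Dict, List, Optional, Tuple

LOOKBACK = timedelta(days=730)  # assumption: '2 years' approximated as 730 days


def first_symptoms_and_tdi(symptom_dates: Dict[str, date],
                           diagnosis: date) -> Tuple[List[str], Optional[int]]:
    """Return (first symptom/s, TDI in days); ([], None) if nothing is eligible."""
    # Omit symptoms that began more than 2 years before the diagnosis date.
    eligible = {s: d for s, d in symptom_dates.items()
                if diagnosis - d <= LOOKBACK}
    if not eligible:
        return [], None
    onset = min(eligible.values())
    # Symptoms sharing the earliest onset date are synchronous first symptoms.
    first = sorted(s for s, d in eligible.items() if d == onset)
    return first, (diagnosis - onset).days


# Hypothetical participant, for illustration only.
example = {
    "cough": date(2011, 1, 15),
    "breathlessness": date(2011, 1, 15),
    "weight loss": date(2011, 3, 1),
    "hoarseness": date(2008, 6, 1),  # more than 2 years before diagnosis: omitted
}
first, tdi = first_symptoms_and_tdi(example, date(2011, 4, 15))
print(first, tdi)  # ['breathlessness', 'cough'] 90
```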
Analysis. Descriptive analyses were performed on all demographic, clinical and symptom data for the group as a whole, and by the diagnostic group (LC, OC and NC). The LC group was also described according to the cancer stage at diagnosis. Clinically relevant a priori demographics, comorbidities, first symptom/s and family history of cancer variables, and those significant at the 20% level in univariate analysis, were included in multivariate analyses. The referral variable, and those with fewer than 10 cases, were excluded. Logistic regression or Cox regression analyses were performed as appropriate. Sensitivity analyses were performed alongside each primary analysis to examine (i) all cancers (LC plus OC) vs NC, and (ii) the 'waiting time paradox' by excluding cases in which diagnosis occurred within 28 days of first symptom/s (Tørring et al, 2012).
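The variable-selection rule and the 'waiting time paradox' sensitivity filter described above can be sketched in a few lines. This is not the study's analysis code: the function names, the treatment of 'within 28 days' as 28 days or fewer, the choice to keep variables with no case count (for example continuous age), and all example values are our assumptions.

```python
from typing import Dict, Iterable, List


def select_variables(a_priori: Iterable[str],
                     univariate_p: Dict[str, float],
                     case_counts: Dict[str, int],
                     p_threshold: float = 0.20,
                     min_cases: int = 10,
                     excluded: Iterable[str] = ("referral",)) -> List[str]:
    """A priori variables plus those with univariate P below the threshold,
    minus the referral variable and any variable with fewer than min_cases cases."""
    candidates = set(a_priori) | {v for v, p in univariate_p.items()
                                  if p < p_threshold}

    def enough_cases(v: str) -> bool:
        n = case_counts.get(v)  # None for variables without a case count
        return n is None or n >= min_cases

    blocked = set(excluded)
    return sorted(v for v in candidates if v not in blocked and enough_cases(v))


def exclude_short_intervals(tdi_days: Iterable[int], threshold: int = 28) -> List[int]:
    """Sensitivity analysis: drop diagnoses made within `threshold` days of first symptom/s."""
    return [t for t in tdi_days if t > threshold]


# Hypothetical inputs, for illustration only.
chosen = select_variables(
    a_priori=["age", "gender"],
    univariate_p={"haemoptysis": 0.001, "hoarseness": 0.45,
                  "diabetes": 0.15, "referral": 0.01},
    case_counts={"haemoptysis": 33, "hoarseness": 20, "diabetes": 7, "gender": 70},
)
print(chosen)  # ['age', 'gender', 'haemoptysis']
```

The selected variables would then be passed to the logistic or Cox regression model in whatever statistical package is used.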

RESULTS
A total of 5097 patients were approached and 995 were recruited, giving an overall 19.5% response rate (East 25.1%, North East 16.8%). The demographics of the responders were similar to those of non-responders (responders 54% male, median age 67 years; non-responders 49.7% male, median age 66 years). The disease stage distribution of our cohort was comparable to national data (late stage 68.2 vs 67.6%) (Cancer Research UK, 2014b). Twenty participants were excluded (returning questionnaire >3 months after diagnosis n = 8, not meeting recruitment criteria n = 4, recent metastatic disease n = 2, recruited via screening trial n = 1, no consent given to access hospital records n = 5), and there were insufficient data for analysis for a further 12, leaving a final cohort of 963 participants.
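The recruitment figures above can be checked with simple arithmetic. The sketch below uses only numbers reported in this paper; the variable names are ours.

```python
approached = 5097
recruited = 995

# Reasons for exclusion, as reported in the text.
exclusions = {
    "questionnaire returned >3 months after diagnosis": 8,
    "did not meet recruitment criteria": 4,
    "recent metastatic disease": 2,
    "recruited via screening trial": 1,
    "no consent to access hospital records": 5,
}
insufficient_data = 12

response_rate = round(100 * recruited / approached, 1)
total_excluded = sum(exclusions.values())
final_cohort = recruited - total_excluded - insufficient_data
isolated_first_pct = round(100 * 475 / final_cohort, 1)

print(response_rate)       # 19.5
print(total_excluded)      # 20
print(final_cohort)        # 963
print(isolated_first_pct)  # 49.3
```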
Among the total cohort, only half had an isolated first symptom (475, 49.3%). Synchronous first symptoms were common, with 19.0% having two first symptoms, 8.8% having three and >10% having four or more synchronous first symptoms. A further 12.5% reported no symptoms within 2 years of diagnosis. Cough or worsening of a long-standing cough and breathlessness or worsening of long-standing breathlessness were the most common symptoms, and each symptom was reported more often among 'all symptoms' than among 'first symptoms', suggesting the evolution of symptoms over time (Table 2). Symptoms not specifically mentioned in the questionnaire, such as backache, sickness/indigestion and symptoms of acute respiratory illness, were individually reported by fewer than 10% of participants. Coughing up blood and unexplained weight loss at any time were infrequently reported in the whole cohort (13.5 and 11.1%, respectively) but were the only symptoms reported by significantly more people diagnosed with lung cancer than with no cancer (21.6 vs 11.8%, P = 0.001; 15 vs 9.2%, P = 0.028, respectively). Cough or worsening of a long-standing cough was reported less commonly as a first symptom in people with lung cancer (39.9 vs 50.7%, P = 0.014).
The median TDI across the whole cohort was 117 days (IQR 50-269); people diagnosed with lung cancer had a significantly shorter median TDI than people diagnosed with no cancer (LC 91 vs NC 124 days, P = 0.037).
Regression analyses. Coughing up blood, cough or worsening cough and decreased appetite were significant predictors of shorter TDI for the total cohort (all P<0.001), whereas a respiratory comorbidity was associated with longer TDI (P = 0.04) (Table 4). Chest/shoulder pain was associated with shorter TDI for primary lung cancer (P = 0.03). When the regression analyses were repeated for any cancer, lower educational level was also associated with longer TDI (P = 0.003).
For early-stage lung cancer, a self-reported family history of cancer and breathlessness or worsening breathlessness were associated with shorter TDI (Table 5). Increasing age and coughing up blood were the only factors that predicted lung cancer (Table 6 and Supplementary Online Material Table A2). Never smoking, or ex-smoking, were inversely associated with the risk of a lung cancer diagnosis, and demonstrated a dose-response relationship. Both arthritis and other respiratory diseases were associated with a lower risk of lung cancer. These factors remained significant in the regression analyses; however, the model explained only 15% of total variability.

DISCUSSION
This is one of the first studies worldwide to examine diagnostic pathways for lung cancer by recruiting, before diagnosis, a large prospective cohort of patients with suspicious symptoms. Haemoptysis was the only symptom associated with lung cancer, but it occurred in just 21.6% of cases, and in only 4.6% of cases as a first symptom; other associated factors were increasing age, smoking status and respiratory and arthritis comorbidities. Diagnostic intervals were longer for non-cancer than cancer diagnoses and for early-stage than late-stage lung cancer. Prolonged chest/shoulder pain was the only first symptom associated with a shorter diagnostic interval for lung cancer than for non-cancer diagnoses. Our findings show that people referred with symptoms suspicious of lung cancer often have complex symptomatology. Only half of our cohort reported an isolated first symptom; the majority developed multiple symptoms over time. This study set out to investigate the TDI from perception of first symptom to diagnosis; therefore, our unit of analysis was the initial symptom or first synchronous symptoms. Although this approach allows us to make robust comparisons with evidence reported from England and elsewhere, it may obscure the finer detail of symptom patterns and clusters as they evolve over time, and their effects on timely help-seeking by patients and timely diagnosis in primary and secondary care.
We found that people with lung cancer were diagnosed more quickly than those with an alternative diagnosis. This may reflect the guidance on urgent referral for suspected lung cancer in England (National Institute for Health and Care Excellence, 2005), which recommends that 'alarm' symptoms such as haemoptysis warrant urgent chest X-ray and referral. Symptoms other than haemoptysis in this relatively large prospective cohort study did not help differentiate lung cancer from other diagnoses, even though some, such as weight loss, can be indicative of advanced disease. This highlights the challenge for earlier detection in primary care for patients with less specific symptoms (Shim et al, 2014).
The median TDI for any symptom was 117 days, and 91 days for those with lung cancer. This remains a substantial period between a person first noticing a symptom and receiving a diagnosis. It is worth noting that a national 'Be Clear on Cancer' lung cancer campaign ran for 2 months (May and June 2012) during our recruitment period (Cancer Research UK, 2014c). The diagnostic intervals are broadly similar to evidence from a UK General Practice Research Database analysis (Neal et al, 2014a). Secondary analyses of a national audit of cancer diagnosis from primary care medical records also suggest that the symptoms and signs of lung cancer may be more quickly acted upon by patients than GPs: lung cancer patients had a median patient interval of just 12 days (Keeble et al, 2014), whereas more than 30% of lung cancer patients had three or more primary care consultations before referral (Lyratzopoulos et al, 2013).
In common with a Danish prospective population-based study of diagnostic intervals (Tørring et al, 2013), we found shorter median intervals associated with later stage at diagnosis, even after adjusting for the 'waiting time paradox'. This adjustment aims to account for patients who present with very short intervals and severe symptoms associated with late-stage disease, often presenting to emergency departments (Tørring et al, 2012). However, the problem of confounding remains to some extent even after this adjustment; late-stage disease may have different symptom profiles that affect help-seeking and diagnostic pathways. This suggests that tumour factors (such as histological type and location) and host factors (such as comorbidity) could influence diagnostic intervals and result in apparently earlier diagnosis of later-stage disease (Tørring et al, 2013).
Respiratory comorbidity was associated with a longer time to diagnosis across the cohort but with a lower risk of being diagnosed with lung cancer, suggesting that the respiratory symptoms were associated with that comorbidity rather than lung cancer. The longer diagnostic intervals in people with respiratory comorbidities may be owing to the patient and their GP attributing new or worsening symptoms to pre-existing illness (Emery et al, 2013; Birt et al, 2014). Persistent or worsening chest/shoulder pain was associated with a shorter time to cancer diagnosis, possibly because the symptom triggered an urgent referral or admission to hospital to exclude cardiovascular causes.
Key strengths of this study are the prospective design and the collection of data from several sources: patient reports, and primary care and specialist records. The analytical and reporting approaches were robust and performed according to the methodological approaches and definitions recommended in the Aarhus statement (Weller et al, 2012). We chose to define the date of first symptom/s using the patient-reported date rather than the primary care-reported date, as we were analysing the patient-reported symptom/s. Ideally, a study would recruit patients from primary care before referral; however, identifying a prospective primary care cohort with respiratory symptoms that contains sufficient numbers of cancers would have major logistical and resource implications. Instead, we recruited patients when first encountered in secondary care; this had the added benefit of recruiting patients admitted via the emergency route. Recruitment involved two regions of England, selected to ensure a broad range of socioeconomic, educational and occupational levels. The deprivation data suggest that the cohort was representative of the national population.
The main study limitation is the recruitment rate of 19.5% overall, ranging from 17% in the North East to 25% in the East of England, similar to other recent studies (McRonald et al, 2014). It is possible that many in the target population were unable or unwilling to complete a questionnaire because they were coping with a serious diagnosis or undergoing treatment, regardless of final diagnosis. However, the demographics of our non-responders were very similar to those of the cohort, and the proportion of late-stage lung cancer was closely comparable to national data (68.2 vs 67.6%). This suggests that we did not recruit a healthier cohort, and our findings are likely to be generalisable. Although this was a large cohort, we had insufficient power to examine specific clusters of symptoms and their associations with our outcomes; a much larger prospective study would be required to achieve this. The analyses focused only on the first symptom or symptoms. The impact of subsequent symptoms on time to diagnosis requires further study. This paper reports our data on the TDI and factors associated with this. Future analyses will explore the relative contributions of the patient interval (from first symptom/s to first presentation in primary care), the primary care interval (from first presentation to referral) and the secondary care interval (from referral to diagnosis) to the TDI.
In conclusion, identifying symptoms and other factors that should prompt an individual to seek help or a GP to perform the appropriate diagnostic test or refer appropriately remains challenging. Haemoptysis is the most important symptom associated with lung cancer, but this is reported as the first symptom in less than 5% of cases. Despite conducting such a large prospective cohort study, we failed to identify any other strong signals of lung cancer diagnosis. This is an important finding while we await the revised National Institute for Health and Care Excellence guidelines for early detection of lung cancer. This study suggests that lung cancer awareness campaigns that currently concentrate on a single symptom should instead consider messages that reflect the multi-symptom nature of its presentation. It may also be that targeted interventions at high-risk populations aimed at symptom monitoring could be more effective at recognising symptom evolution (Smith et al, 2013). Policy initiatives such as prompt chest X-rays for high-risk groups, and the increasingly widespread use of clinical decision support (Hamilton et al, 2013), can be informed by our findings. The next step is to understand the potentially subtle differences in impact of symptoms and patient factors on the patient, GP and specialist intervals. These data will provide support for more targeted evaluation of suspicious symptoms in an attempt to identify lung cancer at an earlier and more amenable stage.