Main

There were 44 888 new cases of lung cancer in the United Kingdom in 2012, making it the second most common cancer, representing 13% of all new cancers (Cancer Research UK, 2015). In the same year, there were 35 371 deaths; 10-year survival is 5%. Outcomes in lung cancer can be improved either by diagnosing at an earlier stage, potentially allowing curative surgery, or by improving treatments. Options for earlier stage diagnosis include: screening programmes with low-dose CT for people at high risk (National Lung Screening Trial Research Team, 2013); targeted public awareness campaigns to encourage people with symptoms to present earlier (Athey et al, 2012; Ironmonger et al, 2015); GPs investigating higher risk patients with symptoms more quickly (as in this trial); allowing GPs access to low-dose CT (Guldbrandt et al, 2014); and the development of predictive biomarkers. A systematic review found that, for lung cancer, the evidence that timelier diagnosis is associated with better outcomes is equivocal (Neal et al, 2015a); however, the higher quality studies reported in this review do show an association (Brocken et al, 2012; Murai et al, 2012; Tørring et al, 2013), suggesting that there is merit in expediting the diagnosis of people presenting symptomatically.

The diagnosis of lung cancer in general practice can be complex and difficult (Neal et al, 2015b) and the disease can therefore be ‘easily missed’ (Neal et al, 2014). The symptom signature of lung cancer is variable (Shim et al, 2014), and many patients consult with their general practitioner several times before referral or investigation (Lyratzopoulos et al, 2012; Birt et al, 2014). In England and Wales, diagnostic activity for cancer is guided by the National Institute for Health and Care Excellence (NICE); their 2005 guidelines (NICE, 2005) were updated in 2015 (NICE, 2015). In this trial we hypothesised that an intervention to lower the NICE symptom threshold for investigation (‘extra-NICE’) may improve clinical outcomes in lung cancer and the cost-effectiveness of lung cancer diagnosis. Our definition of ‘extra-NICE’ (as compared with NICE, 2005, the guidelines in place at the time of the trial) was urgent referral for a chest X-ray for patients aged 60 years or over presenting in general practice with a new or altered cough of any duration, increased breathlessness, or wheezing (whether or not associated with purulent sputum), and who were either current smokers or ex-smokers with 10 or more pack-years of smoking. At the time of the trial, the NICE (2005) guidelines recommended chest X-ray after 3 weeks of symptoms in these patients. The new guidelines (NICE, 2015), published after the trial closed, recommend chest X-ray within 2 weeks in people aged 40 and over if they have two or more of selected unexplained symptoms (cough, fatigue, shortness of breath, chest pain, weight loss, appetite loss), or one of these if they have ever smoked.

This trial aimed to assess the feasibility and inform the design of a definitive, fully powered, UK-wide, Phase III clinical trial of a change to the NICE guidelines for urgent investigation of suspected lung cancer in patients aged over 60. In particular, this paper reports the results and lessons drawn from the development work and the mixed methods feasibility trial that set out to determine: the acceptability of trial design and materials; the training and recruitment of practices; the recruitment and randomisation of patients; appropriate methods for collection of wider clinical and health economics data in a full trial; and the views of participants, non-participants, and health-care professionals on recruitment and randomisation. The trial utilises a combination of workshop, health economics, quality of life, qualitative, and quantitative methods, examining the feasibility of a full trial.

Materials and methods

ELCID (Early Lung Cancer Investigation and Diagnosis) was a randomised, parallel group, unblinded, feasibility trial carried out in general practices in Wales and Yorkshire. A Working Group was formed before recruitment and randomisation with key stakeholders. It aimed to identify the best way to train GPs to identify and recruit eligible patients into the trial, and to identify the most effective method of presenting the trial (and randomisation) to patients. The outputs from this Working Group informed some of the practicalities of recruiting practices and participants and maximising data collection.

Study design, randomisation, sample size considerations, data collection procedures, and planned statistical methods have been published elsewhere (Hurt et al, 2013). Briefly, patients aged over 60 years, who were either smokers or ex-smokers with 10 or more pack-years of smoking history, and who presented at a general practice with a new or altered cough of any duration, or increased breathlessness or wheezing (whether or not associated with purulent sputum), were invited to participate. Patients who qualified for urgent referral for chest X-ray under NICE, 2005 guidelines, or who had had a chest X-ray or CT scan of the chest in the last 3 months, or who needed a chest X-ray within the next 3 weeks, or who had previously been diagnosed with cancer and who had a life expectancy of less than 1 year, were ineligible.

Initially, all general practices in the four Health Boards in South east Wales were invited to participate in the trial. In addition, six practices in Yorkshire were recruited by the local research network (as were a further six practices from north Wales halfway through the trial – see later). Patients were identified either at presentation or, in some practices where resources existed to do so, by looking through the records of the previous week’s consultations. Patients identified at presentation were introduced to the trial and invited to consent or given a future appointment at which eligibility could be confirmed and consent could be taken. Patients identified through medical records were invited to make an appointment to discuss the trial, eligibility, and consent.

The trial aimed to assess:

  • Key design parameters (prevalence of ‘extra-NICE’ symptoms, trial consent rate, and lung cancer rate).

  • Whether the intervention caused increased anxiety or depression.

  • The outcomes of the chest X-rays.

  • The acceptability of trial procedures to participants and recruiters.

  • The best ways of collecting trial data (patient recall, GP records, and routine data sets).

The study was developed on behalf of the NCRI Primary Care Clinical Studies Group, funded by the UK National Awareness and Early Diagnosis Initiative (NAEDI), sponsored by Bangor University, and coordinated by the Wales Cancer Trials Unit and South East Wales Trials Unit at Cardiff University. The study was approved by a UK multicentre ethics committee and carried out in accordance with the Declaration of Helsinki and all UK regulatory requirements.

Those who were eligible and provided written informed consent were then individually randomised (1 : 1) to either an urgent chest X-ray or usual care (NICE, 2005). Patients were randomised and informed of their allocation before being given a suite of questionnaires to complete (Box 1). The same questionnaires were posted to patients 2 months later. Twelve months after randomisation, general practices were contacted to provide follow-up information on the outcomes of any chest X-rays, health service resource use, and clinical outcomes. Data from the Welsh Cancer Intelligence and Surveillance Unit (WCISU) were used to verify the lung cancer data provided by general practices at 12 months (Wales only). The CTU staff maintained regular contact with the practices throughout the study. Full excess treatment and service support costs were provided. In order to incentivise practices further, there were occasional prizes for practices for best or most improved recruitment.

Randomisation

Randomisation took place centrally via the Bristol Randomised Trials Collaboration using either an automated telephone or a secure online service. Participants were randomised using minimisation with a random element (80 : 20) stratified by general practice, age (<75, ⩾75), and chronic obstructive pulmonary disease diagnosis.
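
As an illustration of minimisation with a random element, the sketch below allocates a new participant to the arm that best balances the stratification factors with probability 0.8, and to the other arm otherwise. The exact algorithm used by the central randomisation service is not described here, so the imbalance score, the tie handling, and all names (`minimise`, `FACTORS`, `p_best`) are assumptions made for illustration only.

```python
import random

ARMS = ("intervention", "control")
# Stratification factors named in the text; the dictionary keys are hypothetical.
FACTORS = ("practice", "age_band", "copd")


def minimise(new_patient, allocated, p_best=0.8, rng=random):
    """Choose an arm for new_patient.

    allocated is a list of (patient_dict, arm) pairs for earlier participants.
    """
    # Score each arm by how many earlier participants in that arm share
    # each of the new participant's factor levels (assumed scoring rule).
    scores = {arm: 0 for arm in ARMS}
    for patient, arm in allocated:
        for factor in FACTORS:
            if patient[factor] == new_patient[factor]:
                scores[arm] += 1

    # With no imbalance to correct, allocate purely at random (assumption).
    if scores[ARMS[0]] == scores[ARMS[1]]:
        return rng.choice(ARMS)

    # Otherwise favour the arm with the lower score with probability p_best.
    best = min(ARMS, key=lambda arm: scores[arm])
    other = ARMS[1] if best == ARMS[0] else ARMS[0]
    return best if rng.random() < p_best else other


# Hypothetical example: one earlier participant with identical factor levels.
earlier = {"practice": "A", "age_band": "<75", "copd": False}
print(minimise(dict(earlier), [(earlier, "intervention")]))
```

The random element means the next allocation cannot be predicted with certainty even by someone who knows the current balance within a practice.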

Statistical analysis

The sample size was based on modelling work using the CAPER-lung database, which contains all primary-care consultations, symptoms, and investigations in the 2 years before diagnosis for a 5-year cohort of 247 lung cancer cases (Hamilton et al, 2005; Barrett and Hamilton, 2008). We estimated that 0.77% of practice patients would be eligible for the trial in a year, that as many as 70% of those may be recruited, and of those recruited, 2.4% would develop lung cancer (Hurt et al, 2013). The trial aimed to recruit a sample size of 386 for reasonable precision around these estimates: the proportion of eligible patients in United Kingdom practices per year (0.77%, 95% confidence intervals (CIs): 0.70–0.83%) and of those recruited, the proportion developing lung cancer within 12 months (2.4%, 95% CIs: 0.9–3.9%).
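
The precision quoted for the lung cancer proportion can be checked with a short calculation. The sketch below assumes a normal-approximation (Wald) interval, which reproduces the quoted limits for 2.4% with 386 participants; whether the protocol used exactly this method is an assumption, and the function name `wald_interval` is illustrative.

```python
import math


def wald_interval(p, n, z=1.96):
    """Normal-approximation 95% interval for a proportion p observed in n people."""
    half_width = z * math.sqrt(p * (1 - p) / n)
    return p - half_width, p + half_width


low, high = wald_interval(0.024, 386)
print(f"{low:.1%} to {high:.1%}")  # prints "0.9% to 3.9%"
```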

Data were analysed using the STATA SE 14 statistical package (StataCorp LP, 4905 Lakeway Drive, TX, USA) according to intention-to-treat. All analyses were pre-specified. When determining the lung cancer data (proportion diagnosed, stage of lung cancer, performance status, and radical treatment) the WCISU data set was taken as correct if it differed from the general practice data. Recruitment rates were assessed by: Primary Care Research Incentive Scheme (PiCRIS) level – a Welsh Governmental funding and support scheme for general practices (NHS Wales, 2014); practice size; and months open to recruitment. We compared the completion rates of all questionnaires by trial arm, using χ2-tests. Subgroup analyses of completion of all questionnaires were also performed by age (<75 vs ⩾75 years), gender, post-randomisation anxiety, ethnicity, and smoking status. Post-randomisation anxiety and depression median scores were compared using Mann–Whitney tests. Subgroup analyses of anxiety and depression by age (<75 and ⩾75 years), gender, and smoking status (current or ex-smoker) using Mann–Whitney tests were also performed. Differences in anxiety/depression scores between arms at 2 months post randomisation were assessed using multiple linear regression, with treatment arm and post-randomisation score as covariates in a base model. Subgroup analyses were performed by:

  a) repeating this model by age (<75 or ⩾75), gender, and smoking history (current or ex).

  b) adding age, gender, or smoking history to the base model together with an interaction with treatment arm.

To ensure the validity of the linear regressions, checks were made for evidence of non-normality of residuals using Shapiro–Wilk tests and kernel density, normal probability and normal quantile plots, and heteroscedasticity using residuals vs. fits plots. No adjustment was made for multiple testing.
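
For readers unfamiliar with the Mann–Whitney comparison used for the anxiety and depression scores, the following is a minimal sketch of the test with a tie-corrected normal approximation and no continuity correction. It is not the STATA routine used in the trial, and the scores in the example are invented for illustration.

```python
import math
from collections import Counter


def midranks(values):
    """Rank values from 1 to n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average
        i = j + 1
    return ranks


def mann_whitney(x, y):
    """Return (U for x, z, two-sided P) using the normal approximation."""
    n1, n2 = len(x), len(y)
    n = n1 + n2
    pooled = list(x) + list(y)
    ranks = midranks(pooled)
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    # Tie correction: sum of t^3 - t over groups of tied values.
    tie_term = sum(t ** 3 - t for t in Counter(pooled).values())
    variance = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if variance == 0:  # every value identical, so no evidence of a difference
        return u1, 0.0, 1.0
    z = (u1 - n1 * n2 / 2) / math.sqrt(variance)
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return u1, z, p_value


# Invented scores on a 0-21 scale, for illustration only.
control = [6, 3, 8, 7, 5, 9, 4]
intervention = [5, 3, 9, 2, 6, 4, 7]
u, z, p = mann_whitney(control, intervention)
print(f"U={u}, z={z:.2f}, P={p:.2f}")
```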

Health economics

Health economic data were collected post randomisation and at 2 months as shown in Box 1. The health service use information from the patients was compared with data about resource use collected from the GPs to ascertain the most appropriate method of capturing these data in the full trial. Data were analysed using the IBM SPSS Statistics Version 22 statistical package (IBM Corp., Armonk, NY, USA). The agreement analysis examined the level of agreement between the patient and GP as information sources. Data included: investigations (‘Other’ chest X-ray (i.e., chest X-ray not including the one received on the ELCID trial as part of the intervention), CT scan, MRI scan, and PET-CT scan) performed on patients by 2 months. The Kappa (κ) statistic was employed to test investigation use agreement (Shoukri, 2015). The Landis and Koch standard scales for strength of agreement for the κ coefficient were applied (Landis and Koch, 1977).
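
The agreement analysis can be sketched as follows: percentage agreement and Cohen's κ are computed from a 2×2 table of patient report against GP record, and κ is labelled using the Landis and Koch bands. The counts in the example are hypothetical (chosen only to sum to 167) and are not trial data; the function names are illustrative.

```python
def cohen_kappa(both_yes, patient_only, gp_only, both_no):
    """Return (observed agreement, kappa) for a 2x2 patient-vs-GP table."""
    n = both_yes + patient_only + gp_only + both_no
    observed = (both_yes + both_no) / n
    patient_yes = (both_yes + patient_only) / n
    gp_yes = (both_yes + gp_only) / n
    # Agreement expected by chance if the two sources were independent.
    expected = patient_yes * gp_yes + (1 - patient_yes) * (1 - gp_yes)
    return observed, (observed - expected) / (1 - expected)


def landis_koch(kappa):
    """Strength-of-agreement label from Landis and Koch (1977)."""
    if kappa < 0:
        return "poor"
    bands = ((0.20, "slight"), (0.40, "fair"), (0.60, "moderate"),
             (0.80, "substantial"))
    for upper, label in bands:
        if kappa <= upper:
            return label
    return "almost perfect"


# Hypothetical counts: an investigation that few patients received.
observed, kappa = cohen_kappa(both_yes=3, patient_only=5, gp_only=4, both_no=155)
print(f"agreement={observed:.1%}, kappa={kappa:.2f} ({landis_koch(kappa)})")
```

When an investigation is rare, most of the raw agreement comes from both sources recording that it did not happen, so percentage agreement can be high while κ remains low.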

Results

Practice and patient recruitment

Figure 1

Consort diagram. *Box shows actual numbers from complete screening data from two sites. Figure in square brackets is the overall prediction using the numbers from the two sites as an estimate to predict likely screening at other sites that recruited at least one patient.

The acceptability of the trial design was determined by the Working Group, and was reflected in the successful initial recruitment in Wales and Yorkshire. A total of 28 practices were initially recruited (22 in South east Wales and 6 in Yorkshire), of which three later withdrew without recruiting any patients. Halfway through the trial, an additional six practices were recruited from north Wales by the local research network to boost recruitment. Twenty-two of the remaining 31 practices randomised 255 participants between 8 November 2012 and 9 April 2014. Nine sites failed to recruit a single participant. Those that failed to recruit cited lack of resources or lack of eligible patients as reasons. The CONSORT flow diagram is shown in Figure 1.

Recruitment screening log data were poorly completed generally, but two practices provided reliable data, based on agreement between participation recorded on screening logs and central randomisation data. From these two practices, the prevalence of the trial eligibility criteria was calculated as 0.75% of practice registrations per year (214 eligible people per year from a combined registered population of 28 502; 95% CI 0.65–0.86%). In addition, from these two sites, the proportion of eligible patients who agreed to participate was 74 out of 223 (33.2%; 95% CI 27.0–39.8%).
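
The confidence intervals for these proportions appear consistent with exact (Clopper–Pearson) binomial limits rather than a normal approximation, although the method is not stated in this section, so that is an assumption. A minimal sketch of the exact calculation, using only the standard library and bisection, is given below; the function names are illustrative.

```python
import math


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p), with terms computed in log space."""
    if k >= n or p <= 0:
        return 1.0
    if p >= 1:
        return 0.0
    total = 0.0
    for i in range(k + 1):
        log_term = (math.lgamma(n + 1) - math.lgamma(i + 1)
                    - math.lgamma(n - i + 1)
                    + i * math.log(p) + (n - i) * math.log(1 - p))
        total += math.exp(log_term)
    return min(total, 1.0)


def exact_interval(k, n, alpha=0.05):
    """Clopper-Pearson interval for k events in n trials."""

    def solve(cdf_at, target):
        # cdf_at(p) decreases as p increases, so bisect towards the target.
        lo, hi = 0.0, 1.0
        for _ in range(60):
            mid = (lo + hi) / 2
            if cdf_at(mid) > target:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2

    # Lower limit: P(X >= k) = alpha/2, i.e. P(X <= k - 1) = 1 - alpha/2.
    lower = 0.0 if k == 0 else solve(lambda p: binom_cdf(k - 1, n, p), 1 - alpha / 2)
    # Upper limit: P(X <= k) = alpha/2.
    upper = 1.0 if k == n else solve(lambda p: binom_cdf(k, n, p), alpha / 2)
    return lower, upper


# Consent proportion from the two practices with reliable screening logs.
low, high = exact_interval(74, 223)
print(f"{74 / 223:.1%} ({low:.1%} to {high:.1%})")
```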

The recruitment of participants by practice size, known research activity (PiCRIS scheme – Welsh practices only), and number of months open to recruitment is shown in Table 1. This shows that: practice size had little overall effect on the number of randomised patients per 1000 per year; the PiCRIS level of the practice had little effect on the number of randomised patients per 1000 per year, other than ‘sessional’ (the highest level of participation) practices having a higher randomisation rate; and that practices which were open to recruitment for shorter periods of time had a higher rate of recruitment.

Table 1 Breakdown of recruitment by practice size, known research activity (PiCRIS scheme), number of months open to recruitment

Baseline comparability between the groups

The breakdown of the variables, overall, and by trial arm, are shown in Table 2. This shows that the demographics (gender, age, ethnicity, duration and highest level of education, and economic situation), presenting symptoms, pre-existing co-morbidity, and numbers of recent chest X-rays had similar distributions in each trial arm.

Table 2 Baseline demographics, co-morbidity, symptoms, and previous chest X-rays of trial participants

Fidelity of the intervention

In the intervention arm, 113 out of 127 (89.0%) patients received a total of 148 chest X-rays within 12 months. Of these, 88 patients (69.3%) had their first X-ray within 7 days, 100 (78.7%) within 14 days, and 104 (81.9%) within 21 days. Forty-three out of 128 (33.6%) participants in the control arm also received a total of 50 chest X-rays within 12 months. Of these, 3 patients (2.3%) had their first X-ray within 7 days, 10 (7.8%) within 14 days, and 13 (10.2%) within 21 days. The median times to chest X-ray in the intervention and control groups were 3 days (interquartile range (IQR) 1–7) and 71 days (IQR 17–233), respectively. In the intervention arm, 10 (7.9%) patients had repeat chest X-rays within 3 months (8 had 1 repeat, 2 had 2 repeats with <3 months between each). In the control arm, 3 (2.8%) patients had repeat chest X-rays (1 patient had 1 repeat, 2 had 2 repeats with <3 months between each).

Significant adverse events

Three patients in the control arm died, one due to lung cancer (52 days post randomisation), one due to myocardial infarction (178 days post randomisation), and one due to cardiac arrest (216 days post randomisation). One patient in the intervention arm died, due to lung cancer; time to death was 95 days. None of these events were thought to be related to study procedures.

Survey responses and Hospital Anxiety and Depression Scores

Survey responses are shown in Table 3. This shows that completion of all tools post randomisation was good (89.4%), and equal in both the groups. Completion of tools at 2 months was also similar between the two groups, with just less than two-thirds completion. There was no evidence for differences in completion between arms within subgroups based on age, sex, post-randomisation anxiety, or smoking history (data not presented). Hospital Anxiety and Depression Scores are also presented in Table 3. There was no evidence of a difference in post-randomisation anxiety scores between trial arms (median (IQR): 6 (3–8) in control vs 5 (3–9) in intervention, z=0.32; P=0.75) or between arms by subgroups based on age, sex, or smoking history (P⩾0.35). Similarly, there was no evidence for a difference in post-randomisation depression between arms (median (IQR): 6 (3–8) in control vs 5 (3–9) in intervention, z=−0.12; P=0.91) or between arms by subgroups based on age, sex, or smoking history (P⩾0.56). The 2-month results presented in Table 3 suggest no difference in either anxiety or depression between trial arms except when performing subgroup analyses using age. For both anxiety and depression, the model that included age as a covariate had significant age-trial arm interaction terms (P=0.02 and 0.05, respectively). There was a significantly higher anxiety (median (IQR): 7.5 (4.5–9) vs 4 (3–7), t=−2.97; P=0.006) and depression (7.5 (4.5–9) vs 4 (3–7), t=−2.67; P=0.013) score in the control arm in the ⩾75 age group, although the number of participants in this subgroup was small (n=33).

Table 3 Survey response at baseline and 2 months, and Hospital Anxiety and Depression Scores – 2-month questionnaires

Chest X-ray outcomes, cancer diagnoses, other diagnostic investigations, and other diagnoses

Table 4 and Figure 2 show the outcomes of the chest X-rays. There were more normal results in the intervention arm (105 (70.9%) vs 25 (50%)) but more patients were found to have at least one abnormality in the intervention arm (31 (25.8%) vs 20 (16.3%)). In the intervention arm, four chest X-rays from three participants were suspicious of lung cancer (3 of which led to CT scans within 3 weeks of X-ray, 2 of which led to a diagnosis of lung cancer) compared with one in the control arm (which led to a diagnosis of lung cancer, without CT scan, within 34 days of randomisation and 20 days of chest X-ray). Data from WCISU confirmed the three lung cancer diagnoses reported from primary care. In the intervention arm, one was stage IA, adenocarcinoma, WHO performance status 0, diagnosis 58 days post randomisation, and the participant received radical treatment. The other was stage III, missing tumour type, performance status 3, diagnosis 80 days post randomisation, and the participant received no radical treatment. In the control arm, the cancer was a small cell tumour, stage IV at diagnosis, performance status 2, diagnosis 39 days post randomisation, and the participant received no radical treatment. Thus, 3 out of 255, that is, 1.2% (95% CIs: 0.2–3.4%) of trial participants were diagnosed with lung cancer. A flow chart showing pathways to other outcomes from normal and abnormal chest X-rays, including further investigations for both trial arms, is shown in Figure 2.

Table 4 Chest X-ray outcomes
Figure 2

Flow chart showing all outcomes from chest X-rays in both the groups. I=Intervention, C=control.

Exploration of the most appropriate measure of collecting health resource use data – agreement analysis results

Table 5 shows the results of agreement between the resource use data self-reported by the patients and the resource use data collected routinely from the GP records. Overall there was a high degree of agreement between patients’ self-reported data and routine-based health-care resource use data, with values ranging from 67.67 to 97.60%. Agreement was the highest for PET-CT scan and lowest for ‘other’ chest X-ray. However, the strength of agreement was considerably below average, ranging from fair agreement (κ=0.32, s.e. 0.18, 95% CI −0.03–0.66) for CT scan, to slight agreement (κ=0.20, s.e. 0.06, 95% CI 0.08–0.31) for ‘other’ chest X-ray. No results of kappa are presented for either MRI or PET-CT; this is because no comparison was possible as none were reported. The full findings from the health economics analysis will be reported elsewhere.

Table 5 Level of agreement between self-reported data and case report form data collected from the GP records (routinely collected data) by type of investigations use in the 2 months post baseline (n=167)

Discussion

Summary of main findings

This feasibility trial has achieved its targets in terms of assessing: key design parameters, impact on anxiety/depression, chest X-ray outcomes, the acceptability of trial procedures, and the best ways of collecting trial data. It supports the feasibility of conducting a Phase III trial. Our trial suggests that the prevalence of the trial eligibility criteria was 0.75% (95% CIs: 0.65–0.86%) of practice registrations per year, matching our estimate in the protocol. However, the proportion of eligible patients who agreed to participate was considerably lower than expected (33.2%; 95% CI 27.0–39.8%), explaining our lower than anticipated recruitment of 255 participants rather than the target 386. In addition, the trial suggests that the proportion of trial participants who developed lung cancer was 1.2% (95% CIs: 0.2–3.4%) – this is in keeping with our a priori estimate of 2.4%. The same three cancers identified within the practices were also identified by WCISU. The abnormal chest X-rays undertaken in both arms triggered further investigations, again as anticipated. We also identified more cardiac and other lung pathologies in the intervention than control group, although we are unsure of the clinical significance of this. We were able to collect questionnaires from patients post randomisation and at 2 months with acceptable return rates. There was some evidence that anxiety and depression may have been higher in the control arm in those aged ⩾75 years.

This was a challenging feasibility trial to undertake because we did not know at the outset whether it would be acceptable to recruit practices and patients to a study that individually randomised patients with possible lung cancer to quicker diagnostic testing. We were explicit in our patient recruitment materials that participants were at higher risk of lung cancer and that the intervention was a test to detect lung cancer earlier. We believe that this is the first trial that has recruited and randomised patients in this context. Through the use of good methods and engagement, such as a Working Group, training all relevant practice staff, practice payments for extra time taken, research network staff support wherever possible, use of practice databases to identify eligible patients wherever possible, we were able to recruit practices and participants albeit at a lower rate than anticipated.

Discussion of the findings within the context of the literature

The acceptability of trial design and materials, the training and recruitment of practices, and the recruitment and randomisation of patients

The work that we put into the Working Group enabled us to make minor changes to the trial design that made the trial more acceptable to both practices and patients. This was demonstrated in our success in recruitment and randomisation. Although we thought that obtaining consent from patients may be more difficult, we elected to use an individually randomised design rather than a cluster design. This may have led to GPs having to take more time to explain the trial to patients accompanied by the administrative burden of individually randomising patients, with a consequent diminished consent rate. However, we found little evidence of contamination in the control arm seeking urgent chest X-rays.

Anxiety and depression

Our finding of higher anxiety in those aged ⩾75 years in the control group is consistent with a Danish trial, which reported that participation in a randomised lung screening trial was associated with negative psychosocial consequences (Aggestrup et al, 2012).

The views of patients and recruiters

The importance of qualitative work in feasibility studies in establishing best recruitment methods has been highlighted by others (Fletcher et al, 2012; White and Hind, 2015). We successfully engaged with patient trial participants, patient non-participants, and recruiters from practices. The findings from this, which will be reported in full elsewhere, will inform the design of the subsequent main trial. Briefly, recommendations are to ensure recruiting sites are equipped to reduce misconceptions and misunderstandings of trial purpose, processes, and interventions when seeking consent. There will be less emphasis on smoking in the patient-facing materials. In terms of recruitment sites, the need for a strong point of contact for trial promotion and knowledge of processes is advised.

The appropriateness of methods for collection of wider clinical and health economics data in a full trial

The 2-month questionnaire return rates were >60%, a level thought to be crucial in surveys to avoid bias from outliers (Kiess and Bloomquist, 1985). In the agreement analysis, the percentage of agreement between self-report data and routine-based health-care resource use data from patients’ GP records for all four types of investigations was generally high, and in line with other studies (Ungar and Coyte, 1998; Bhandari and Wagner, 2006). These findings suggest that self-report of health service use over a reasonably short recall period in this older population group seems valid. The completion rates for the tools that we used were acceptable, and we were able to collect useful information on presenting symptoms and co-morbidity. Importantly, we did not see a difference in anxiety and depression scores between trial arms. The fidelity of the intervention was good, in that 82% of participants in the intervention arm received a chest X-ray within 3 weeks. The proportion of participants in the control arm who received a chest X-ray was similar to what we anticipated.

Strengths and weaknesses

The main strength of the study is that we have conducted a pragmatic feasibility individually randomised trial in primary care, and fully tested all of the processes involved for a full trial. The main weakness is that in the protocol we aimed to recruit 386 participants, but we only recruited 255. This has lowered the precision with which we can estimate the key parameters required in designing the Phase III trial. We are also aware that the age and smoking thresholds used in this trial are different from some of the screening trials. Our screening log data may not be generalisable as it was based on two practices only. In addition the number of patients recruited per practice varied, with two practices contributing a third of the participants.

Implications for policy, practice, and research

In summary, we believe that we have demonstrated the feasibility of this trial design. Although it raises several significant challenges (Box 2), we believe that there is a need for a trial that aims to diagnose symptomatic lung cancer earlier. This hypothesis has been helped by a recent study that has estimated the symptom lead time in lung cancer (the duration of reported symptoms before diagnosis), finding no relationship between duration of symptoms and stage (Biswas et al, 2015). The data for this study were collected in the era before the considerable expansion of diagnostic services for suspected cancer had occurred, and it is possible that results in a study conducted now would differ. That paper also reported the concept of two populations with lung cancer who are identified in any programme of enhanced diagnostics: those whose symptoms are caused by the cancer, and those whose symptoms met criteria for investigation, but were not actually caused by the cancer.

In conclusion, we have demonstrated the feasibility of recruiting to an individually randomised controlled trial in primary care for earlier chest X-ray for patients at higher risk of lung cancer who present to primary care with new symptoms. We are now developing a Phase III trial with the intention of evaluating the effect of timelier symptomatic diagnosis on lung cancer outcomes. While NICE guidelines have changed since we undertook the feasibility trial, we believe that our findings will inform a trial of very similar design; that is, individual randomisation of people at greater risk for expedited investigation or usual care. Our findings are also likely to have implications for other studies of urgent investigation for serious conditions.

Ethics and trial registration

NHS ethical approval was granted by the North Wales Research Ethics Committee (11/WA/0222) on 25 August 2011. ClinicalTrials.gov NCT01344005.