Introduction

Many countries have implemented referral guidelines for patients with respiratory alarm symptoms in order to expedite diagnosis of lung cancer and reduce the diagnostic interval, thus increasing survival rates1. Presentation of alarm symptoms in primary care settings is often preceded by patients’ recognition of potentially serious symptoms. Qualitative studies have demonstrated that patients with lung cancer often experience symptoms months before diagnosis, but do not interpret such symptoms as serious enough to warrant seeking health care2. Furthermore, public awareness of lung cancer symptoms is low, and public knowledge about risk factors other than smoking is sparse3. Therefore, many health campaigns have been conducted with the aim of increasing public awareness of these alarm symptoms as indicative of cancer in hopes of reducing the interval before lung cancer diagnosis4,5. Despite awareness that smoking is an important risk factor for lung cancer, smokers are less likely to seek medical attention when experiencing respiratory alarm symptoms6.

Retrospective surveys of symptom experiences prior to diagnosis have identified prolonged cough, dyspnoea and hemoptysis as key respiratory alarm symptoms7. However, such symptoms also often precede common, more benign conditions; on the other hand, many lung cancer patients experience both tumour-specific symptoms such as coughing and dyspnoea and systemic, nonspecific symptoms of malignancy such as weight loss and loss of appetite8,9. Based on data from a nationwide survey of symptom experiences in the general population, we have previously analysed the prevalence of several symptoms10,11,12,13 as well as the predictive value of gastrointestinal alarm symptoms14,15. Lung cancer is one of the most common cancers worldwide, causing high morbidity and mortality. The prognosis of lung cancer is highly dependent on stage of disease at the time of diagnosis, and since many patients are diagnosed at advanced stages, the survival rates of lung cancer are poor16. The literature examining predictive values of specific and nonspecific symptoms of lung cancer is, however, sparse, and prospective studies that systematically record such symptoms and explore their predictive values for lung cancer diagnosis are needed2,17. We therefore conducted this study, aiming to (1) assess lung cancer alarm symptoms reported by individuals prior to lung cancer diagnosis, and (2) analyse the predictive values of lung cancer alarm symptoms experienced in the general population. Furthermore, our aim was to analyse the association between experiencing single or multiple respiratory alarm symptoms and receiving a lung cancer diagnosis within 6 months or 12 months, and also to analyse how smoking status and reported contact with general practitioners (GPs) regarding lung cancer alarm symptoms influence predictive values of the symptoms.

Results

A total of 4,747 (4.7%) of the 100,000 individuals invited to answer the questionnaire were not eligible because of death, unknown address, severe illness, language barrier or emigration. A total of 95,253 subjects were eligible for the study, of whom 69,060 were ≥40 years. Among the eligible individuals ≥40 years, 37,455 (54.2%) completed the questionnaire. A total of 47.3% (17,701) of respondents aged 40 years or older were male (Fig. 1).

Fig. 1: Study cohort.

Flow chart of the study cohort selection process.

We found that 6,261 (16.7%) of respondents ≥40 years reported at least one specific alarm symptom. The most frequently reported symptom was cough lasting more than 4 weeks, which was reported by 3,400 (9.1%) of the respondents (Table 1).

Table 1 Prevalence of symptoms, smoking status and GP contact among respondents 40 years and older.

Our analyses of the socioeconomic characteristics of eligible individuals aged ≥40 years demonstrate that respondents were younger, had higher rates of cohabitation and labour market affiliation, had higher educational and income levels, and were more often of Danish ethnicity than non-respondents. The details of the analyses are reported elsewhere15. A total of 20% of the respondents stated that they were daily smokers. In a national health survey from 2013, 17% of the population were daily smokers18, indicating that the smoking behaviour of our respondents was comparable with that of the general population.

The number of incident lung cancer cases among the respondents ≥40 years was 22 after 6 months (0.58‰) and 41 after 12 months (1.1‰). Among non-respondents, a total of 27 new cases of lung cancer were registered after 6 months (0.85‰) and 58 new cases after 12 months (1.83‰).
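The per-mille figures above can be approximately reproduced from the counts reported in this section. The sketch below is illustrative only: the function name `per_mille` is hypothetical, and the non-respondent denominator is an assumption (eligible individuals ≥40 years minus respondents), since the text does not state it explicitly. Small differences from the reported values may reflect rounding or exclusions not captured here.

```python
def per_mille(cases: int, population: int) -> float:
    """Cumulative incidence expressed per 1,000 persons."""
    return 1000 * cases / population


# Counts reported in the Results section.
eligible_40_plus = 69_060
respondents = 37_455
# Assumption: non-respondents are all eligible individuals >=40 years
# who did not complete the questionnaire.
non_respondents = eligible_40_plus - respondents

incidence = {
    ("respondents", 6): per_mille(22, respondents),
    ("respondents", 12): per_mille(41, respondents),
    ("non-respondents", 6): per_mille(27, non_respondents),
    ("non-respondents", 12): per_mille(58, non_respondents),
}

for (group, months), rate in incidence.items():
    print(f"{group:>15}, {months:>2} months: {rate:.2f} per 1,000")
```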

The PPV for being diagnosed with lung cancer in the 6-month follow-up period was 0.2 (95% CI: 0.1–0.3) for individuals reporting at least one of the specific alarm symptoms. The PPV was similar in the 12-month follow-up period. Among the specific alarm symptoms, the highest PPVs were found for dyspnoea, with a PPV of 0.2 (95% CI: 0.1–0.4) and LR+ of 2.8 (95% CI: 1.3–6.0) in the 6-month follow-up (Table 2), and hoarseness, with a PPV of 0.3 (95% CI: 0.1–0.7) and LR+ of 2.6 (95% CI: 1.0–6.6) in the 12-month follow-up (Table 3). Among the nonspecific symptoms, the highest PPVs were found for loss of appetite, with a PPV of 0.3 (95% CI: 0.1–0.6) and LR+ of 4.8 (95% CI: 2.2–10.3) in the 6-month follow-up (Table 2), and a PPV of 0.3 (95% CI: 0.1–0.6) and LR+ of 2.5 (95% CI: 1.1–5.8) in the 12-month follow-up (Table 3).

Table 2 Positive and negative predictive values (PPV and NPV) and positive and negative likelihood ratios (LR+ and LR−) for lung cancer diagnosis in 6-month follow-up period after study questionnaire.
Table 3 Positive and negative predictive values (PPV and NPV) and positive and negative likelihood ratios (LR+ and LR−) for lung cancer diagnosis in 12-month follow-up period after study questionnaire.

Contacting a GP regarding at least one of the specific symptoms had a PPV of 0.1 (95% CI: 0.0–0.4). In the 12-month follow-up, being a current heavy smoker carried in itself a PPV of 0.3 (95% CI: 0.1–0.6).

The NPVs were all close to 100, and the LR− values were predominantly close to 1. The highest LR− was 2.7 (95% CI: 1.7–4.3) in the 6-month follow-up for not experiencing an alarm symptom.

Discussion

In this study, we analysed the predictive values and likelihood ratios of specific and nonspecific alarm symptoms of lung cancer reported by a large sample of the Danish general population ≥40 years. The PPVs were generally very low. Respondents experiencing loss of appetite, dyspnoea, or hoarseness for more than 4 weeks had the highest risk of subsequent lung cancer diagnosis. These findings indicate that identifying lung cancer patients based on specific alarm symptoms is challenging, and that nonspecific alarm symptoms could play an important role in identifying individuals at risk.

The NPVs were high, all close to 100, and the highest LR− was for not reporting an alarm symptom. This means that respondents not experiencing any alarm symptoms had a very high chance of not being subsequently diagnosed with lung cancer.

The prospective cohort design is a major strength of this study, as it provides the opportunity to obtain information about pre-diagnostic symptom experiences. The prospective design minimises the risk of recall bias, which is often a substantial challenge in studies of symptoms among cancer patients prior to diagnosis. Identifying lung cancer cases in the Danish Cancer Registry, rather than asking survey respondents, further reduces the risk of recall bias. This registry is based on mandatory data from multiple sources, and is considered a valid source of information on cancer diagnoses19.

Another strength is that the study is large-scale and nationwide with a random selection of individuals invited. However, individuals with many symptoms or those who have made contact with their GP multiple times may be more motivated to participate in a survey regarding symptoms and healthcare seeking. This could lead to an overestimation of symptom prevalence. On the other hand, it is plausible that persons experiencing several symptoms and undergoing numerous healthcare visits might not have surplus energy to complete the rather comprehensive questionnaire. Therefore, both over- and underestimation of symptom experiences are possible limitations to our study.

The rather high response rate of 54.2% amongst the eligible individuals ≥40 years is a strength, but it is important to keep in mind that differences between respondents and non-respondents might have affected the results. The 12-month incidence of lung cancer was higher among non-respondents than among respondents (1.83‰ vs. 1.1‰). The difference might be explained by socioeconomic disparity between respondents and non-respondents: respondents were younger, more often of Danish ethnicity and of higher socioeconomic status than non-respondents, and low socioeconomic status is a risk factor for lung cancer even when adjusting for smoking status20,21. Therefore, the predictive values of lung cancer alarm symptoms reported here might not be generalisable to patients with low socioeconomic status.

We chose to include all incident lung cancer cases in both a 6-month and a 12-month follow-up period after reporting one or more of the respiratory alarm symptoms. We limited follow-up to these periods to enhance the likelihood that a reported symptom was linked to the subsequently diagnosed lung cancer. A longer follow-up period would have increased the number of incident lung cancer cases, but would have also weakened the link between symptom experience and subsequent diagnosis.

A general weakness of surveys is that questionnaires may not measure precisely what they are designed to measure. To ensure that the respondents interpreted the questions and answer categories as intended, we conducted numerous series of validation, pilot testing and field testing prior to survey launch22. Based on the results of the pilot testing, it is reasonable to assume that the respondents understood and answered the questions as anticipated. Although the survey comprised questions about symptom experiences within a short time period (the preceding 4 weeks), some memory or recall bias cannot be ruled out.

We chose the lung cancer symptoms based on literature review and symptoms mentioned in lung cancer diagnostic pathways. However, other symptoms and characteristics such as pain or recent pneumonia can be signs of lung cancer as well and could have been included in the analyses. Unfortunately, we did not have access to such information.

Compared with previous studies on symptom prevalence in the general population, the response rate in our study was comparable or even higher23,24. Similar to the findings of this study, other DaSC studies have found the PPVs of gastrointestinal alarm symptoms to be low14,15. Compared with a study on respiratory symptoms in a general population in Britain25, we found lower rates of cough and dyspnoea. This could probably be explained by the fact that the study by Walabyeki et al. was conducted among individuals aged 50 and older, and had a higher proportion of current smokers (25.8% vs. 20% in our study). Furthermore, their study was carried out between October and March, whereas ours was carried out from June to December22. Therefore, a substantial portion of the symptoms recorded in Walabyeki et al. can be assumed to have been caused by seasonal respiratory tract infections25.

The most common symptoms of lung cancer are cough, dyspnoea and hemoptysis26, but it has been demonstrated that the predictive value of cough, dyspnoea and general symptoms that might be indicative of lung cancer is rather low (0.4–1.1%)27. Our study adds to this knowledge, suggesting that similar symptoms in the general population have even lower predictive value. This suggests that individuals deciding to contact a GP regarding a respiratory alarm symptom have a higher risk of underlying lung cancer.

Most lung cancer patients present with multiple symptoms, both respiratory and constitutional9, and in a lung cancer prediction model, loss of appetite and having less strength were found to be the strongest predictors28. This is in line with our finding that respondents with nonspecific symptoms, especially loss of appetite, had the highest risk of subsequent lung cancer.

Among the specific alarm symptoms, it has been shown that hemoptysis is the strongest predictor of subsequent lung cancer diagnosis2,29,30,31. However, as observed in our study, hemoptysis is a quite infrequent symptom; due to Danish rules on data protection, we are unable to report data for fewer than three persons, meaning that we were unable to calculate predictive values for hemoptysis in some categories of analysis.

We found that the PPVs and LR+ for alarm symptoms of lung cancer are quite low in the general population. Further insights could be obtained by focusing in the future on the predictive values of different combinations of specific and nonspecific alarm symptoms and signs. Moreover, combining laboratory results with symptom clusters could provide more information about determinants for cancer. However, this would require inclusion of an even larger population.

Quantification of the diagnostic value of symptoms in lung cancer detection is useful knowledge for clinicians. For the GP reviewing many patient contacts regarding symptoms that might be indicative of lung cancer, it is important knowledge that the risk from respiratory alarm symptoms is low, and that numerous referrals of patients with such symptoms may occur without revealing any cases of lung cancer. It is valuable for the GP to be able to inform patients when referring for lung cancer investigation that even with specific symptoms the risk of lung cancer is very small.

Furthermore, it is noteworthy that nonspecific symptoms such as loss of appetite might be as indicative of lung cancer as specific respiratory symptoms can be. For the clinician, it might be natural to investigate gastrointestinal symptoms in patients reporting loss of appetite, for example; however, our findings suggest that a supplementary investigation of lung cancer may also be worthwhile for such patients. Nonspecific symptoms may trigger the GP’s intuition of serious disease, and our results underline that nonspecific symptoms should be taken seriously. On the other hand, it is also valuable knowledge for GPs that NPVs are high, i.e., there is a very low risk of overlooking lung cancer in the absence of alarm symptoms.

Similarly, our results could be useful for health service planning. As the prognosis for lung cancer depends on stage at diagnosis, a logical intervention to improve survival could be to increase public knowledge of lung cancer symptoms and to encourage visiting a GP. Earlier presentation of symptoms to the GP and earlier referrals for symptom investigation could result in the earlier detection of some cancers4,5. However, health service planners need to be aware that as a diagnostic tool, symptoms are not as precise as pathology results due to the subjective and individual nature of symptoms. Given the low predictive value of specific and nonspecific alarm symptoms and the consequences of cancer, both the costs and capacity of the healthcare system should be kept in mind.

Furthermore, the predictive values of nonspecific symptoms demonstrate that referral guidelines and fast-track investigation programs should not be overzealous or based solely on specific respiratory symptoms but possibly also on clusters of symptoms and signs. Referral options should also include patients with nonspecific symptoms according to the GP’s intuition, since this factor is intangible but proven to be a strong predictor of cancer diagnosis32.

Methods

Study design and population

The study is a part of the Danish Symptom Cohort (DaSC), a nationwide cohort study comprising survey and health register data of a randomly selected cohort of adults in the general population of Denmark22. A total of 100,000 adults (≥20 years) randomly selected from the Civil Registration System were invited by mail to participate in a survey about symptom experiences. The letter included a unique log-in for a secure webpage with a comprehensive web-based questionnaire. People with no internet access were offered the survey by telephone interview. The survey was conducted from June to December 2012.

Questionnaire

The questionnaire comprised items about several symptoms, respondents’ reactions in response to those symptoms, lifestyle factors and general health beliefs and behaviour. This paper covers the specific and nonspecific lung cancer alarm symptoms (Table 4). The symptoms were selected based on a review of the literature, including national and international guidelines7,9,17,33. Other studies derived from DaSC have reported prevalence estimates and predictive values of cancer alarm symptoms related to other types of cancer10,11,13,14,15.

Table 4 Specific respiratory and nonspecific alarm symptoms.

The respondents were asked whether they had experienced one or more of the specific and nonspecific alarm symptoms: “Have you experienced any of the following sensations, symptoms or discomfort within the past 4 weeks?” If respondents confirmed a symptom experience, they were subsequently asked when they had experienced the symptom for the first time. In order to be classified as a respiratory alarm symptom, symptoms of cough and hoarseness had to persist for >4 weeks; other symptoms had to be present within the 4 weeks prior to the survey, but not necessarily for >4 weeks. The respondents were then asked: “Have you contacted your general practitioner concerning the symptom(s) you have experienced within the preceding 4 weeks, by appointment, telephone or e-mail?”

Further information on respondents’ smoking status was obtained through the questionnaire and the respondents were categorised into those who had never smoked, former smokers and current smokers. Current smokers were subdivided into light or heavy smokers: heavy smoking was defined as reported daily tobacco use equivalent to >15 cigarettes. One cheroot was equivalent to three cigarettes, and one cigar or one pipe was equivalent to five cigarettes.
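The smoking categorisation described above can be sketched in a few lines. This is a minimal illustration, not the study's actual code: the names `CIGARETTE_EQUIVALENTS`, `daily_cigarette_equivalents` and `classify_current_smoker` are hypothetical, and only the conversion factors and the >15 threshold come from the text.

```python
from typing import Mapping

# Conversion factors stated in the text: one cheroot counts as three
# cigarettes, one cigar or one pipe as five cigarettes.
CIGARETTE_EQUIVALENTS = {"cigarette": 1, "cheroot": 3, "cigar": 5, "pipe": 5}
HEAVY_THRESHOLD = 15  # heavy smoking: strictly more than 15 equivalents per day


def daily_cigarette_equivalents(daily_use: Mapping[str, float]) -> float:
    """Convert reported daily tobacco use into cigarette equivalents."""
    return sum(CIGARETTE_EQUIVALENTS[product] * amount
               for product, amount in daily_use.items())


def classify_current_smoker(daily_use: Mapping[str, float]) -> str:
    """Return 'heavy' or 'light' for a current smoker."""
    total = daily_cigarette_equivalents(daily_use)
    return "heavy" if total > HEAVY_THRESHOLD else "light"


# Hypothetical respondents, for illustration only.
print(classify_current_smoker({"cigarette": 10, "cheroot": 2}))  # 16 equivalents
print(classify_current_smoker({"cigar": 3}))                     # 15 equivalents
```

Because the threshold is strict, a respondent reporting exactly 15 cigarette equivalents per day falls in the light category.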

Register data

Information on sex and age of the invited individuals was obtained through the Civil Registration System. Each individual is assigned a unique identification number at birth or upon obtaining a residency permit in Denmark, thus enabling anonymized, individual-level data to be linked between healthcare registers and socioeconomic registers34. This study comprises information from respondents aged 40 or older, due to the extremely low incidence of lung cancer among younger patients35.

Diagnoses of lung cancer (ICD-10-CM Code C34 Malignant neoplasm of bronchus and lung) were retrieved from the Danish Cancer Registry19, which comprises information about all incident cancer cases in Denmark, including date of diagnosis and ICD-10 codes for malignancy subtype and stage at diagnosis. Only cases of lung cancer diagnosed within a 6-month and 12-month period after questionnaire completion were included in this study. For non-respondents, we also identified the number of lung cancer cases diagnosed within 6- and 12-month periods after invitation to the survey. Both respondent and non-respondent cases were excluded if the individual had been diagnosed with the same cancer (ICD-10 code) within a period of 5 years prior to completion of the questionnaire/invitation.

Data on cohabitation status, educational level, labour market affiliation, income and ethnicity were obtained from nationwide socioeconomic registers36,37,38 to analyse possible socioeconomic disparities between respondents and non-respondents.

Statistical analysis

The positive predictive value (PPV) for each alarm symptom was calculated by dividing the number of symptomatic respondents subsequently diagnosed with lung cancer by the total number of symptomatic respondents in each category. Negative predictive values (NPVs) were calculated by dividing the number of asymptomatic respondents not subsequently diagnosed with lung cancer by the total number of asymptomatic respondents. The predictive values are presented as percentages. PPVs and NPVs for lung cancer were calculated for each of the symptoms, for at least one of the symptoms, for reported contact with a GP regarding at least one of the symptoms, and reported smoking status. Besides PPVs and NPVs, the positive likelihood ratios (LR+) and negative likelihood ratios (LR−) of the association between symptom experience, GP contact and lung cancer were calculated. These relative ratios were calculated because we expected that the incidence of alarm symptoms would be much higher than the incidence of lung cancer, hence the association between symptom experience and lung cancer would be attenuated by only calculating PPVs and NPVs.
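As an illustration of these measures, the sketch below computes PPV, NPV, LR+ and LR− from a 2×2 table. The PPV and NPV follow the definitions given above; the likelihood ratios use the standard definitions LR+ = sensitivity / (1 − specificity) and LR− = (1 − sensitivity) / specificity, which the text does not spell out. The class name `TwoByTwo` and the counts are hypothetical and are not study data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    tp: int  # symptomatic, diagnosed with lung cancer
    fp: int  # symptomatic, not diagnosed
    fn: int  # asymptomatic, diagnosed with lung cancer
    tn: int  # asymptomatic, not diagnosed

    @property
    def sensitivity(self) -> float:
        return self.tp / (self.tp + self.fn)

    @property
    def specificity(self) -> float:
        return self.tn / (self.tn + self.fp)

    @property
    def ppv(self) -> float:
        """Positive predictive value, as a percentage."""
        return 100 * self.tp / (self.tp + self.fp)

    @property
    def npv(self) -> float:
        """Negative predictive value, as a percentage."""
        return 100 * self.tn / (self.tn + self.fn)

    @property
    def lr_positive(self) -> float:
        return self.sensitivity / (1 - self.specificity)

    @property
    def lr_negative(self) -> float:
        return (1 - self.sensitivity) / self.specificity


# Made-up counts chosen for easy arithmetic; NOT taken from the study.
table = TwoByTwo(tp=10, fp=990, fn=10, tn=8990)
print(f"PPV {table.ppv:.1f}%  NPV {table.npv:.2f}%  "
      f"LR+ {table.lr_positive:.2f}  LR- {table.lr_negative:.2f}")
```

The example shows why both kinds of measure are useful: with a rare outcome the PPV stays near 1% even though the likelihood ratio indicates that the symptom is several times more common among cases than among non-cases.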

Confidence intervals were calculated using a binomial distribution. All statistical tests used a significance level of P < 0.05. Data analysis was conducted using STATA 13.1 statistical software (StataCorp, College Station, TX, USA).
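The text states only that a binomial distribution was used for the confidence intervals. One common choice is the exact (Clopper-Pearson) interval; the sketch below assumes that method, which may differ from what was actually run in STATA. The function names `binom_cdf` and `exact_binomial_ci` are hypothetical, and the example counts are made up.

```python
import math


def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p), with terms computed in log space."""
    if k < 0:
        return 0.0
    if k >= n or p <= 0.0:
        return 1.0
    if p >= 1.0:
        return 0.0
    log_p, log_q = math.log(p), math.log1p(-p)
    total = 0.0
    for i in range(k + 1):
        log_term = (math.lgamma(n + 1) - math.lgamma(i + 1)
                    - math.lgamma(n - i + 1) + i * log_p + (n - i) * log_q)
        total += math.exp(log_term)
    return min(total, 1.0)


def _root_of_increasing(g, iterations: int = 100) -> float:
    """Bisection on [0, 1] for a function g that increases with p."""
    lo, hi = 0.0, 1.0
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if g(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def exact_binomial_ci(x: int, n: int, alpha: float = 0.05):
    """Clopper-Pearson interval for x events among n, as proportions."""
    # Lower bound: P(X >= x | p) = alpha / 2; this tail grows with p.
    lower = 0.0 if x == 0 else _root_of_increasing(
        lambda p: (1 - binom_cdf(x - 1, n, p)) - alpha / 2)
    # Upper bound: P(X <= x | p) = alpha / 2; the CDF shrinks as p grows.
    upper = 1.0 if x == n else _root_of_increasing(
        lambda p: alpha / 2 - binom_cdf(x, n, p))
    return lower, upper


# Made-up counts for illustration; NOT taken from the study.
low, high = exact_binomial_ci(10, 1000)
print(f"PPV 1.0% (95% CI: {100 * low:.1f}-{100 * high:.1f})")
```

With few events and large denominators, as in this study, such intervals are wide relative to the point estimate and asymmetric around it.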

Ethics approval and consent to participate

The Regional Scientific Ethics Committee for Southern Denmark evaluated the project and concluded that, according to Danish law, it could be implemented without permission from the committee. Informed consent was obtained from the respondents, and answering the questionnaire was completely voluntary and unpaid. The participants in the study were clearly informed that there would be no clinical follow-up and that they should contact their own GP in case of concern or worry. The project was also approved by the Danish Data Protection Agency (journal number 2011-41-6651).

Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.