Introduction

Chronic obstructive pulmonary disease (COPD) is an under-diagnosed and under-treated condition.1–4 One of the main reasons for this is poor access to spirometers.1 In the USA, a validated questionnaire (the COPD Population Screener, COPD-PS) with five questions, including respiratory symptoms and risk factors for COPD, has been used to identify patients requiring confirmatory spirometry (http://www.copd.org/screening/survey). While this two-step strategy for COPD detection reduces the proportion of unnecessary spirometry tests, the COPD-PS has only 61% specificity at the recommended cut-off value,5 which means that 39% of those without COPD are sent for diagnostic quality spirometry.

Peak expiratory flow (PEF) can be used to detect severe airflow obstruction, according to data from the population-based multicentre Burden of Obstructive Lung Disease and Proyecto LatinoAmericano de Investigación en Obstrucción Pulmonar (PLATINO) studies.6,7 The Burden of Obstructive Lung Disease study investigators found that performing PEF for smokers aged >40 and then only sending those with a low PEF for post-bronchodilator (BD) spirometry reduced the need for diagnostic quality spirometry to 12% of the screened population.6 However, PEF was obtained with a spirometer whereas more widely available inexpensive mechanical peak flow meters are known to be less accurate and reproducible compared with diagnostic quality spirometers.6,8 A similar strategy has been tested with other pocket spirometers in selected populations.9,10

A 6-second spirometry test has been proposed as a simplified alternative to a more prolonged forced vital capacity (FVC) manoeuvre.11–19 Although current criteria to diagnose airflow obstruction and COPD are based on forced expiratory volume in 1 s (FEV1)/FVC, FEV1/forced expiratory volume in 6 s (FEV6) is almost equivalent, simpler, and possibly more specific.20 Several inexpensive electronic devices (also known as pocket spirometers) measure FEV1 and FEV6, an improvement over PEF measurements.21–30

Detecting COPD in primary care settings through a three-step strategy consisting of (1) identifying adults with a higher likelihood of having COPD using a questionnaire; (2) performing pocket spirometry; and (3) sending only those positive on pocket spirometry for formal spirometry can be more efficient than employing a questionnaire or pocket spirometry alone. We evaluated the performance of this three-step COPD screening strategy in a representative sample of Mexico City residents by determining the proportion of the total population requiring diagnostic quality pre- and post-BD spirometry as well as the proportion of undetected COPD cases.

Materials and Methods

Data were obtained from a cross-sectional survey conducted in 2010 in a multi-stage cluster sample of Mexico City residents aged ≥40 years. Since the analysed data did not include all questions used in the COPD-PS, we also present the results of an analysis to develop a COPD scale using a related sample surveyed in 2003 as part of the PLATINO study, whose methods have been described in detail elsewhere.31–33 The 2010 survey was conducted on the same households as those selected for the PLATINO study and included all residents aged ≥40 years, the majority of whom had also participated in the PLATINO study 7 years earlier. The study protocol for the 2010 survey was approved by our institution’s ethics committee, with all participants providing signed informed consent.

Interviewers applied the same structured questionnaire in both surveys (PLATINO, http://www.PLATINO-alat.org). Pre-BD and post-BD spirometry was also performed by trained technicians for each participant at his/her household using portable spirometers of diagnostic quality (EasyOne, NDD Medical Technologies, Zurich, Switzerland) and following the American Thoracic Society quality criteria.34,35 All spirometry tests in the PLATINO study were performed using the same model of spirometer. For the 2010 survey we used the EasyOne spirometer connected to a laptop computer (ndd Easy on PC) for 82% of the study participants. Participants in the 2010 survey also performed three 6-second expiratory manoeuvres with a turbine-based pocket spirometer (COPD-6 model 4000; Vitalograph, Ennis, Ireland). The highest FEV1 and FEV6 values were independently selected from the three available measurements and used for analysis. To assess the impact of training or fatigue on the measurements, participants were randomly allocated to first perform either the pre-BD diagnostic spirometry test or the pre-BD 6-second spirometry test.
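The selection rule for the pocket spirometer readings can be illustrated with a minimal Python sketch. The function name and the sample readings are hypothetical and are not taken from the study; the only rule carried over from the text is that the highest FEV1 and the highest FEV6 are chosen independently, so they may come from different manoeuvres.

```python
def best_fev1_fev6(manoeuvres):
    """Pick the highest FEV1 and highest FEV6 independently.

    `manoeuvres` is a list of (fev1, fev6) pairs in litres, one pair per
    6-second manoeuvre. Returns (best_fev1, best_fev6, ratio).
    """
    if not manoeuvres:
        raise ValueError("at least one manoeuvre is required")
    best_fev1 = max(fev1 for fev1, _ in manoeuvres)
    best_fev6 = max(fev6 for _, fev6 in manoeuvres)
    return best_fev1, best_fev6, best_fev1 / best_fev6


# Hypothetical readings from three manoeuvres (illustration only).
readings = [(2.40, 3.10), (2.55, 3.05), (2.50, 3.20)]
fev1, fev6, ratio = best_fev1_fev6(readings)
```

In this made-up example the best FEV1 comes from the second manoeuvre and the best FEV6 from the third, which is what independent selection allows.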

Because of the lack of agreement among experts, three common definitions of COPD were considered: (1) post-BD FEV1/FVC <0.70 (Global Initiative for Chronic Obstructive Lung Disease (GOLD) stages 1–4);36 (2) the more specific GOLD stages 2–4 (post-BD FEV1/FVC <0.70 with FEV1 <80% predicted);20,37,38 and (3) FEV1/FVC below the lower limit of normal (LLN). We used predicted values for FEV1/FVC derived from the PLATINO study.30
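As an illustration of how the three definitions differ, the following Python sketch classifies a single participant. The function and argument names are hypothetical, and the example values, including the lower limit of normal, are invented for illustration rather than drawn from the PLATINO reference equations.

```python
def copd_definitions(fev1_fvc, fev1_pct_pred, lln):
    """Apply the three COPD definitions to one set of post-BD results.

    fev1_fvc      -- post-BD FEV1/FVC as a fraction (e.g. 0.68)
    fev1_pct_pred -- post-BD FEV1 as a percentage of predicted (e.g. 85.0)
    lln           -- lower limit of normal for the ratio, as a fraction
    """
    gold_1_4 = fev1_fvc < 0.70
    gold_2_4 = gold_1_4 and fev1_pct_pred < 80.0
    below_lln = fev1_fvc < lln
    return {"gold_1_4": gold_1_4, "gold_2_4": gold_2_4, "below_lln": below_lln}


# Hypothetical older participant: ratio just under the fixed 0.70 threshold,
# preserved FEV1, and an age-appropriate LLN of 0.66 (assumed value).
example = copd_definitions(fev1_fvc=0.68, fev1_pct_pred=85.0, lln=0.66)
```

Here the participant meets definition (1) only, which shows why the fixed-ratio criterion is the least specific of the three.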

Using the 2003 Mexico City PLATINO dataset as a training sample, we fitted unconditional logistic regression models39 to develop a questionnaire-based scale for predicting the log odds of COPD (GOLD stages 1–4) as a function of the following available predictors: age, sex, the presence of cough and phlegm on most days for at least 3 months per year, medical diagnosis of asthma and tuberculosis, pack-years of smoking, and exposure to dust in the workplace and to wood smoke from cooking. Age and pack-years of smoking were initially entered into the logistic model and then the other potential predictors were entered individually to assess whether they had a statistically significant associated coefficient and whether they improved the logistic area under the curve (AUC). Scores for the COPD scale were obtained by multiplying the logistic coefficient associated with each predictor by 10 and rounding off the result to the nearest integer.
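The conversion of logistic coefficients into scale points can be sketched as follows. The coefficients and category names below are hypothetical placeholders, not the fitted values reported in Table 2; only the rule (multiply by 10 and round to the nearest integer) comes from the text.

```python
# Hypothetical logistic coefficients (log odds ratios), for illustration only.
coefficients = {
    "age_50_59": 0.62,
    "age_60_plus": 1.48,
    "pack_years_10_plus": 0.97,
}

# Scale points: coefficient x 10, rounded to the nearest integer.
points = {name: round(beta * 10) for name, beta in coefficients.items()}


def scale_score(present, points_table=points):
    """Sum the points of the categories that apply to one person."""
    return sum(points_table[name] for name in present)


# A person aged 60 or older with at least 10 pack-years (hypothetical).
total = scale_score(["age_60_plus", "pack_years_10_plus"])
```

Because the points are proportional to the coefficients, ranking people by the summed score approximates ranking them by predicted log odds, with a small loss of precision from rounding.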

Non-parametric receiver operating characteristic (ROC) curves were used to assess the discriminatory power for COPD detection of both the new COPD scale and the raw FEV1/FEV6 from the pocket spirometer to identify appropriate cut-off values. Additionally, Youden’s J statistic,40 with equal weights for sensitivity and specificity, was calculated as a measure of test performance. Combined sensitivity and specificity were calculated for a screening strategy consisting of first applying the developed COPD scale and then performing 6-second spirometry in those with a COPD scale score indicating COPD. For this strategy, estimates of the proportion of the total screened population requiring confirmatory (diagnostic quality) spirometry and of its combined positive predictive value were obtained for a range of COPD prevalence values using Bayes’ theorem.
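The arithmetic behind serial testing and the positive predictive value can be sketched in Python. This is a simplified illustration, not the procedure used in the study: it assumes that the two tests are conditionally independent given disease status, whereas combined accuracy can also be computed directly from paired data. The input values are round numbers chosen for illustration.

```python
def serial_accuracy(sens1, spec1, sens2, spec2):
    """Accuracy of 'positive on test 1 AND positive on test 2'.

    Assumes conditional independence of the two tests given disease status.
    """
    sens = sens1 * sens2
    spec = 1.0 - (1.0 - spec1) * (1.0 - spec2)
    return sens, spec


def positive_predictive_value(sens, spec, prevalence):
    """Bayes' theorem: P(disease | positive test)."""
    true_pos = sens * prevalence
    false_pos = (1.0 - spec) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


# Illustrative inputs: questionnaire (0.90, 0.50), pocket spirometer (0.80, 0.85).
combined_sens, combined_spec = serial_accuracy(0.90, 0.50, 0.80, 0.85)
ppv = positive_predictive_value(combined_sens, combined_spec, prevalence=0.10)
```

Serial testing lowers sensitivity (both tests must be positive) but raises specificity, which is the trade-off the strategy relies on.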

Agreement between the measurements obtained using the spirometers and the pocket spirometer was assessed with the intraclass correlation coefficient41 and the Bland–Altman plot.42 All analyses considered the sample design and were performed using STATA software V.12 (StataCorp, College Station, TX, USA).
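For readers unfamiliar with the Bland–Altman approach, the following Python sketch computes the mean difference and the limits of agreement (mean difference ± 2 standard deviations). The paired values are invented for illustration, and the sketch ignores the survey design that the study analyses took into account.

```python
from statistics import mean, stdev


def bland_altman(device_a, device_b, k=2.0):
    """Return (bias, lower limit, upper limit) for paired measurements.

    bias   -- mean of the paired differences (a - b)
    limits -- bias -/+ k sample standard deviations of the differences
    """
    if len(device_a) != len(device_b):
        raise ValueError("paired measurements must have equal length")
    diffs = [a - b for a, b in zip(device_a, device_b)]
    bias = mean(diffs)
    sd = stdev(diffs)
    return bias, bias - k * sd, bias + k * sd


# Hypothetical FEV1/FEV6 (%) from a pocket and a diagnostic spirometer.
pocket = [82.0, 78.0, 90.0, 75.0]
formal = [78.0, 75.0, 83.0, 74.0]
bias, lower, upper = bland_altman(pocket, formal)
```

With these made-up values the differences are 4, 3, 7 and 1, giving a bias of 3.75 and limits of agreement of -1.25 to 8.75 percentage points.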

Results

The results presented here do not consider either the type of spirometer used or the order in which the 6-second spirometry and diagnostic quality spirometry tests were performed since these two factors had only a marginal influence on our results. The 2003 Mexico City database contained information on 956 participants aged ≥40 years with complete information on key variables (Figure 1). Of these, 542 were not included in the 2010 survey and constitute the training sample in which we developed the COPD scale.

Figure 1

Venn diagram showing the analysed Mexico City samples.

In the 2010 Mexico City survey we identified 1,040 eligible subjects, 737 of whom completed the questionnaires and pre-BD spirometry. Post-BD spirometry with complete questionnaires and 6-second spirometry tests were available for 659 individuals (63.4% of those eligible), 414 of whom had also participated in the 2003 PLATINO survey. These individuals constitute the sample in which we validated our developed COPD scale.

Summary measures for relevant variables are presented in Table 1 for the two analysed samples. Participants in the 2010 survey were slightly older than those in the 2003 training sample. In both surveys about 40% of participants were men. The average height was similar in both samples but the average weight and the prevalence of obesity were higher in the 2010 sample. The prevalence of respiratory symptoms, medically diagnosed asthma (5%) and tuberculosis (<1%) as well as the exposure to tobacco and wood smoke were similar in the two samples. In contrast, participants in the 2010 subsample were less likely to report physician-diagnosed COPD (0.8%) than those included in the 2003 sample (2.4%).

Table 1 Characteristics of participants with post-bronchodilator spirometry in the two analysed samples

Exposure to a dusty job for more than 1 year was less frequently reported in the 2010 subsample than in the 2003 sample (38.5 vs. 46.7%). The odds of having GOLD stages 1–4 were 2.6 times lower in the 2010 sample compared with the 2003 sample (P<0.05, see Supplementary Table S1). This difference persisted after adjusting for age and pack-years of smoking. Although not statistically significant at P<0.05, the odds of having GOLD stages 2–4 and FEV1/FEV6 <LLN were also lower in the 2010 sample (2.0 and 1.6 times lower than in 2003, respectively).

Table 2 shows the COPD scale obtained by fitting a logistic regression model to predict COPD GOLD stages 1–4 on the 2003 training database. As expected, COPD prevalence, so defined, increased as age and pack-years of cigarette smoking increased. None of the other variables shown in Table 1 added more information to the prediction (see Supplementary Table S2). The total COPD scale score ranged between 0 and 34.

Table 2 Logistic regression model to predict the presence of GOLD stages 1–4 as a function of age and pack-years of smoking in the 2003 Mexico City PLATINO Study subsample (n=542)

The developed COPD scale had an acceptable performance for predicting GOLD stages 1–4 as assessed by the area under the ROC curve (0.77) (Figure 2). The scale shows lower areas under the curve for predicting GOLD stages 2–4 and COPD defined as FEV1/FEV6 <LLN (0.71 and 0.64, respectively). Nevertheless, at the chosen cut-off score of 10 or more, the developed COPD scale was able to identify correctly 82% of cases with GOLD stages 1–4, 79% of those meeting the GOLD definition for stages 2–4, and 68% of those with FEV1/FEV6 <LLN. On the other hand, the scale specificity ranged from 60% when the GOLD stages 1–4 definition was employed to 46% when the FEV1/FEV6 <LLN definition was used.

Figure 2

ROC curves for detecting COPD, as defined by three criteria ((a) GOLD stages 1–4; (b) GOLD stages 2–4; (c) FEV1/FEV6<LLN), using the COPD scale developed from the 2003 Mexico City PLATINO subsample. Numbers close to the curve indicate the selected cut point. Numbers in parenthesis correspond to the 95% confidence intervals for the area under the curve.

Compared with formal pre-BD spirometry, the pocket spirometer gave similar FEV1 averages in the tested participants of the 2010 survey but lower FEV6 readings (Table 3). Consequently, the predicted mean FEV1/FEV6 in the tested population was higher when measured by the pocket spirometer (107.1 vs. 100.8%, respectively). The 6-second spirometry test also showed, on average, higher intratest coefficients of variation than those observed in pre-BD spirometry. As a result, only 71% of the 6-second spirometry tests had the equivalent of grade A quality compared with approximately 80% of pre-BD spirometry tests. All these differences were statistically significant at P<0.05.

Table 3 Spirometry parameters and variability indicators for pre-BD spirometry and 6-SS test in the Mexico City 2010 survey sample (n=737)

FEV1/FEV6 values obtained from the pocket spirometer had a poor intraclass correlation: 0.26 comparing 6-second spirometry results with pre-BD spirometry (95% confidence interval, 0.19–0.33, mean difference +5.1%) and 0.34 comparing 6-second spirometry results with post-BD diagnostic spirometry (95% confidence interval, 0.27–0.41, mean difference +3.7%, see Figure 3). The Bland–Altman plot showed that extreme differences in FEV1/FEV6 between the two devices (more than twice the s.d.) were frequent (Figure 3). In addition, the pocket spirometer produced lower values for this ratio at values of 90% and higher. This latter finding was also observed in a laboratory linearity check comparing COPD-6 measurements with a flow-volume calibrator (Flow-Volume Calibrator; Jones Medical Instruments, Oak Brook, IL, USA).

Figure 3

Bland–Altman plot comparing: (a) the FEV1/FEV6 (%) obtained from Vitalograph COPD-6 6-SS with that obtained from pre-bronchodilator spirometry, (b) FEV1/FEV6 (%) from COPD-6 with that obtained in the laboratory from the flow-volume calibration syringe. Horizontal dotted lines at about 10 and 20 indicate the limits of agreement of the % FEV1/FEV6 difference (within 2 standard deviations of the mean difference). The line over the points corresponds to the median band of the % FEV1/FEV6 difference.

Figure 4 shows the ROC curves for detecting COPD using the pocket spirometer. Not surprisingly, the discriminatory power was better for detecting FEV1/FEV6 <LLN (AUC 0.88) than for detecting GOLD stages 1–4 (AUC 0.86) and GOLD stages 2–4 (AUC 0.85).

Figure 4

ROC curves for detecting COPD, as defined by three criteria ((a) GOLD stages 1–4; (b) GOLD stages 2–4; (c) FEV1/FEV6<LLN), using the Vitalograph COPD-6 6-SS in the 2010 Mexico City survey sample. Numbers close to the curve indicate different cut points for the FEV1/FEV6 %. Numbers in parenthesis correspond to the 95% confidence intervals for the area under the curve.

Table 4 shows sensitivity and specificity estimates at chosen cut-off points obtained using the 2010 dataset. In this sample the COPD scale, positive at a score of 10 or more, correctly identified 92% of participants with GOLD stages 1–4 and 2–4, and 82% of those with FEV1/FEV6 <LLN. Specificity values of this scale were similar for the three COPD definitions (close to 47%).

Table 4 Accuracy of the COPD scale, the Vitalograph 6-SS test and the combination of both for COPD detection at indicated cut-off points observed in the 2010 Mexico City survey sample

Youden’s J statistic shows that the pocket spirometer—with values ranging between 0.62 and 0.65 depending on the COPD definition used—is a more informative screening test than the COPD scale with values ranging between 0.29 and 0.39. The pocket spirometer was able to detect almost 80% of those with GOLD stages 1–4 and nearly 75% of those with GOLD stages 2–4 or with FEV1/FEV6 <LLN using a raw FEV1/FEV6 <0.80 to identify those with post-BD airway obstruction.
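Youden's J is simply sensitivity plus specificity minus 1. As a rough check, the Python sketch below applies that formula to the rounded figures quoted in this section for GOLD stages 1–4 (scale: 92% and about 47%; pocket spirometer: almost 80% and close to 85%). These inputs are approximations read from the text, so the outputs only approximate the exact values in Table 4.

```python
def youden_j(sensitivity, specificity):
    """Youden's J with equal weights: J = sensitivity + specificity - 1."""
    return sensitivity + specificity - 1.0


# Rounded values quoted in the text for GOLD stages 1-4 (approximate).
j_scale = youden_j(0.92, 0.47)
j_pocket = youden_j(0.80, 0.85)
```

The results, about 0.39 for the scale and about 0.65 for the pocket spirometer, fall at the upper end of the ranges reported above.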

On the other hand, the specificity of the pocket spirometer was higher than that of the COPD scale (close to 85% for any of the COPD definitions used). If all subjects with a score of 10 or more on the COPD scale were tested with the pocket spirometer, two-thirds of the COPD cases (using any COPD definition) would test positive (FEV1/FEV6 <0.80). In addition, under this serial testing scenario, the combined specificity would be close to 90% (also using any COPD definition). Youden’s J values for serial testing ranged between 0.53 and 0.60, not far from those observed for the pocket spirometer alone.

Figure 5 summarises several performance parameters of the serial testing for a range of prevalence values similar to those observed for the five Latin American cities included in the PLATINO study. Between 35 and 48% of the screened population would be negative on the COPD scale and would not proceed to the second step. If those positive on the COPD scale performed a 6-second spirometry test, the percentage of the total population requiring all three screening steps including confirmatory spirometry would range from 10 to 20%.

Figure 5

Projected percent of the total population positive to the COPD scale, combined positive predictive value and percent of the total population requiring confirmatory spirometry under serial screening with COPD scale and 6-SS, calculated for a range of COPD prevalence values (with 3 COPD definitions, graph a, b, c). Grey lines around means correspond to 95% confidence intervals.
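The kind of projection summarised in Figure 5 can be approximated with a short Python sketch. It is not the calculation used to draw the figure: it assumes that the two tests are conditionally independent given disease status and uses rounded sensitivity and specificity values quoted in the text, so its outputs are only indicative. The function name and default arguments are illustrative.

```python
def screening_flow(prevalence, scale=(0.92, 0.47), pocket=(0.80, 0.85)):
    """Project the flow of a screened population through the first two steps.

    scale, pocket -- (sensitivity, specificity) pairs; defaults are rounded
                     values from the text, used here as assumptions.
    Returns the fraction stopping after the scale and the fraction that
    would be referred for confirmatory spirometry.
    """
    scale_sens, scale_spec = scale
    pocket_sens, pocket_spec = pocket
    cases, non_cases = prevalence, 1.0 - prevalence
    scale_pos = cases * scale_sens + non_cases * (1.0 - scale_spec)
    both_pos = (cases * scale_sens * pocket_sens
                + non_cases * (1.0 - scale_spec) * (1.0 - pocket_spec))
    return {"stop_after_scale": 1.0 - scale_pos, "need_confirmatory": both_pos}


flow_10 = screening_flow(prevalence=0.10)
```

At an assumed prevalence of 10%, this gives about 43% of the population stopping after the scale and about 15% needing confirmatory spirometry, broadly in line with the ranges described for the figure.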

Discussion

Main findings

Using logistic regression, we fitted a parsimonious model with only two predictors, age and pack-years of smoking, which are easily obtained. The COPD scale derived from this model had a reasonable accuracy (AUC 0.64–0.77) depending on the COPD definition used.

According to our results in the validation sample, we chose a score of 10 or more on the COPD scale as the threshold of COPD risk during the first screening stage, which produced a high sensitivity (82–92%) at the cost of a specificity of about 47%. This threshold excludes about half of the population from further screening, missing only 8–18% of mostly mild COPD cases who would presumably have many years of subsequent screening before developing clinically significant COPD, even if they continued smoking and were susceptible to developing COPD.

The pocket spirometer was found to have higher variability in measuring FEV1/FEV6 than formal spirometry. This was probably due to the fact that only three expiratory manoeuvres were done, regardless of the repeatability of the results. We also found that the raw FEV1/FEV6 had better discriminatory power than the percentage predicted FEV1/FEV6, with areas under the ROC curve 0.02–0.05 higher. Using a cut-off point of <0.80, the raw FEV1/FEV6 sensitivity was 75–80% with a specificity of about 85%.

Combining the COPD scale with pocket spirometry in serial testing enables the detection of two-thirds of COPD cases by performing formal spirometry in a small fraction (10–20%) of the total screened population. We found that the simplified spirometry by itself had a marginally better test performance than if performed sequentially after a COPD scale. However, acceptance of the simplified spirometry would probably be higher in those found to have an increased likelihood of having COPD by the COPD scale (mainly older smokers), with the additional advantage of reducing the number of pocket spirometry tests needed—an important concern in settings with limited resources.

Interpretation of findings in relation to previously published work

Other recently published COPD scales5,43 have better sensitivity and specificity than the one we present here. However, our scale can be easily translated into target populations requiring further COPD screening: (1) adults aged 40–49 years with ≥10 pack-years of smoking; (2) ever smokers aged 50–59 years; and (3) all adults aged ≥60 years.

In a recent Spanish COPD-6 validation study44 the cut-off point of <0.80 for the raw FEV1/FEV6 was also found to have better sensitivity and specificity than the cut-off point of <0.70 recommended by the manufacturer. Using only the percentage predicted FEV1 from the Vitalograph COPD-6 screening device could have better discriminatory power than the raw FEV1/FEV6 for detecting cases of severe COPD, but our sample size was too small to exclude random error from this comparison (see Supplementary Figure S1).

The prevalence of COPD or poorly reversible airflow obstruction depends critically on the criteria used.20,37 The GOLD stage 1 definition has a high false positive rate in older people. Using GOLD stage 2 (requiring a low FEV1) or using FEV1/FVC <LLN provides more specificity.20,36 Part of the significant variation in the prevalence of COPD in different cities, which may be higher than three-fold based on GOLD stages 1–4 criteria,45,46 may be due to differences in the quality of spirometry tests from one technologist to another. Some technologists urge participants to blow out for many seconds which reduces the measured FEV1/FVC, thus increasing the estimated prevalence of COPD.47 Since the best COPD definition is still a controversial issue, we decided to use three alternative definitions of COPD. Interestingly, our results indicate that the three-step screening strategy was relatively insensitive to the COPD definition employed.

We observed an important reduction in the prevalence of COPD using GOLD 1–4 criteria between the 2010 and 2003 samples and a smaller decrease using GOLD 2–4 and FEV1/FEV6 <LLN criteria, which was not explained by changes in age or pack-years of smoking. Mean forced expiratory time in the 2010 sample was 1.5 s shorter than in the 2003 sample (P<0.05), which tends to reduce the prevalence of airflow obstruction based on FEV1/FVC criteria but not that derived from FEV1/FEV6.47

Strengths and limitations of this study

We studied a population sample with an unbiased distribution of risk factors, symptoms, and lung function. Limitations of our study include a relatively small sample size and a low COPD prevalence (the lowest prevalence in the Burden of Obstructive Lung Disease and PLATINO studies, under any definition). As a consequence, the precision of our estimates of the accuracy of the tested COPD screening instruments may not be optimal, and we could not analyse in detail the performance of the screening instruments by COPD severity. In addition, the results of our population-based study may not apply to patients seen in primary care settings where a higher pre-test prevalence of disease or a higher proportion of persons aged ≥60 years is expected.

Implications for future research, policy and practice

No screening strategy is ideal. Diagnostic quality post-BD spirometry, the gold standard for COPD diagnosis, cannot usually be applied to the whole population and is unavailable in primary care, even in developed countries. Minimising the number of these tests should therefore be part of the screening strategy, even at the cost of missing a reasonable proportion of COPD cases, fortunately most of them with mild disease. This can be achieved more efficiently by combining information from a questionnaire and pocket spirometry.

The inclusion in the screening for COPD of a simplified lung function test performed with a low cost pocket spirometer adds objective lung function testing to the screening process, thus helping to solve two important problems in COPD detection: (1) a high rate of false positive diagnoses (up to 50% of the total population) if only a questionnaire is used for screening; and (2) the identification of a significant proportion of undiagnosed patients with poorly reversible airflow obstruction (COPD or asthma) who are not receiving appropriate treatment. We chose FEV1/FEV6 over PEF or FEV1 because it is more specific for airway obstruction whereas PEF and FEV1 are also reduced by conditions causing restriction of lung volumes as well as sub-maximal inhalation at the beginning of the forced exhalation. In addition, 6-second spirometry compares volumes (FEV1 and FEV6) at fixed times of the expiratory manoeuvre and avoids inconsistencies due to changes in the expiratory time across different centres or over time, an important concern for the generalisability of COPD screening in primary care. Additional information is needed from a clinical study performed in a primary care setting using this three-step screening strategy.

Conclusions

Our results indicate that screening for COPD is best achieved, in terms of yield and cost, by a three-step scheme that starts with a few simple questions and continues with a pocket spirometry test in those at higher risk of COPD, thereby restricting diagnostic quality spirometry to those with a low FEV1/FEV6.