Accuracy of Vitalograph lung monitor as a screening test for COPD in primary care

Microspirometry may be useful as the second stage of a screening pathway among patients reporting respiratory symptoms. We assessed sensitivity and specificity of the Vitalograph® lung monitor compared with post-bronchodilator confirmatory spirometry (ndd Easy on-PC) among primary care chronic obstructive pulmonary disease (COPD) patients within the Birmingham COPD cohort. We report a case–control analysis within 71 general practices in the UK. Eligible patients were aged ≥40 years who were either on a clinical COPD register or reported chronic respiratory symptoms on a questionnaire. Participants performed pre- and post-bronchodilator microspirometry, prior to confirmatory spirometry. Out of the 544 participants, COPD was confirmed in 337 according to post-bronchodilator confirmatory spirometry. Pre-bronchodilator, using the LLN as a cut-point, the lung monitor had a sensitivity of 50.5% (95% CI 45.0%, 55.9%) and a specificity of 99.0% (95% CI 96.6%, 99.9%) in our sample. Using a fixed ratio of FEV1/FEV6 < 0.7 to define obstruction in the lung monitor, sensitivity increased (58.8%; 95% CI 53.0, 63.8) while specificity was virtually identical (98.6%; 95% CI 95.8, 99.7). Within our sample, the optimal cut-point for the lung monitor was FEV1/FEV6 < 0.78, with sensitivity of 82.8% (95% CI 78.3%, 86.7%) and specificity of 85.0% (95% CI 79.4%, 89.6%). Test performance of the lung monitor was unaffected by bronchodilation. The lung monitor could be used in primary care without a bronchodilator using a simple ratio of FEV1/FEV6 as part of a screening pathway for COPD among patients reporting respiratory symptoms.


INTRODUCTION
Chronic obstructive pulmonary disease (COPD) is one of the most common long-term respiratory conditions with rising burden and mortality worldwide. [1][2][3] It is characterised by increasing breathlessness and decline in lung function, punctuated by episodes of acute exacerbations that often lead to hospital admission and result in poor prognosis and gradual deterioration of quality of life. 4 Annual healthcare and societal costs of COPD in Europe are estimated to be €48.4 billion. 5 Despite the high burden of disease, the large majority of patients with COPD remain undiagnosed 6 while experiencing significant morbidity, 7 resulting in calls to improve early diagnosis. 8,9 Early diagnosis could focus smoking cessation support and allow prescription of treatments that have been shown to reduce risk of exacerbation in those with COPD, and thus has the potential to slow disease progression.
Screening programmes are not yet recommended, partly because of lack of evidence of the long-term benefits, 10,11 a view which is upheld in the most recent UK National Screening Committee report. 12 However, there are also uncertainties around the performance of available screening tests, including symptom or risk assessment questionnaires and lung function-based measures, alone or in combination. 12,13 A recent study compared different screening strategies among current smokers, against post-bronchodilator spirometry. It concluded that microspirometry or peak flow meters had the best performance, but interpretation was limited by a small sample size and low-quality spirometry data. 14 Microspirometers are small, relatively inexpensive handheld devices that measure forced expiratory volume in 1 s (FEV1) and in 6 s (FEV6). While microspirometry is not a substitute for confirmatory spirometry, which is more time consuming and measures FEV1 and forced vital capacity (FVC), usually after bronchodilation, the FEV1/FEV6 ratio could be used as a pragmatic initial screening test to identify patients requiring confirmatory spirometry. Microspirometry can be undertaken in office settings and requires less time and patient effort. [15][16][17][18] Over the past decade, several studies have explored the accuracy of microspirometers in detecting airflow obstruction. 14,[19][20][21][22][23][24][25][26][27] However, none of the studies considered the use of microspirometers as the second stage of a screening pathway. Microspirometry as a screening tool is usually performed without bronchodilation, as this contributes to time savings and avoids the need for salbutamol. However, it remains uncertain how microspirometry performance differs when conducted pre- and post-bronchodilator. Finally, there is little consensus regarding the optimal FEV1/FEV6 cut-point for referral to confirmatory spirometry, with recent studies suggesting ratios of <0.73, 22 <0.75 21 and <0.78. 19
To address the current evidence gaps, we conducted a study in primary care patients with existing respiratory symptoms, including those pre-screened in our linked trial. We aimed to assess the test performance of a microspirometer (Vitalograph lung monitor) against confirmatory post-bronchodilator spirometry (ndd Easy on-PC) and to explore the effect of using pre- or post-bronchodilator microspirometer data, the impact of using different airflow obstruction criteria and optimal cut-points.

RESULTS
Follow-up assessments were booked for 1633 participants. Out of the 1500 participants who attended the assessment, 551 took part in the case-control study. Lung monitor and spirometry test data were available for a total of 544 participants (Fig. 1).
Nearly half of the participants (45.5%) reported exacerbations in the past 12 months and 37 (6.8%) reported a respiratory hospitalisation in the past 2 years (Table 1). Cases reported approximately twice as many exacerbations as controls (54.2% vs 31.4%) in the past 12 months.
Lung monitor and confirmatory spirometry tests were also highly correlated for post-bronchodilator FEV 1 (r = 0.97; p < 0.001), with the Bland-Altman plot again demonstrating good agreement ( Supplementary Fig. 4). Comparison of post-bronchodilator FEV 6 again revealed high correlation (r = 0.97; p < 0.001), though agreement was lower, indicating that the lung monitor      5). Discriminatory accuracy of the post-bronchodilator lung monitor was identical to that based on pre-bronchodilator data (C = 0.90; 95% CI 0.87, 0.92; Supplementary Fig. 6).
Sensitivity analyses: optimal cut-points for lung monitor FEV1/FEV6 ratio, relative to confirmatory spirometry FEV1/FVC < LLN
In light of comparable test accuracy of the lung monitor based on pre- and post-bronchodilator data, we explored optimal FEV1/FEV6 cut-points using pre-bronchodilator tests (Table 4).
In our sample, an FEV1/FEV6 cut-point of <0.78 had the best overall test performance, with sensitivity of 82.8% (95% CI 78.3%, 86.7%) and specificity of 85.0% (95% CI 79.4%, 89.6%). Using this cut-point, the lung monitor would miss only 17.2% of true cases while correctly identifying the majority of patients without the disease. Furthermore, this cut-point would result in 57% of those screened requiring confirmatory spirometry. The positive predictive value for a population COPD prevalence of 6% was estimated to be 26.1% (95% CI 25.0%, 27.2%), meaning that around one in four patients referred for confirmatory spirometry would receive a diagnosis.
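The predictive value and referral share quoted above can be approximately reproduced from the reported sensitivity and specificity using Bayes' Theorem. The Python sketch below is illustrative only: the function names are ours and it is not the Stata code used for the analysis. The referral share assumes the sample composition reported in this paper (337 cases among 544 participants).

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """Post-test probability of disease given a positive test (Bayes' Theorem)."""
    true_positive = sensitivity * prevalence
    false_positive = (1.0 - specificity) * (1.0 - prevalence)
    return true_positive / (true_positive + false_positive)


def proportion_test_positive(sensitivity, specificity, prevalence):
    """Share of everyone tested who would screen positive and be referred."""
    return sensitivity * prevalence + (1.0 - specificity) * (1.0 - prevalence)


# Reported accuracy of the FEV1/FEV6 < 0.78 cut-point.
SENSITIVITY = 0.828
SPECIFICITY = 0.850

# PPV at an assumed population prevalence of 6%.
ppv_6pct = positive_predictive_value(SENSITIVITY, SPECIFICITY, 0.06)

# Referral share within the study sample (337 cases out of 544 participants).
referred_in_sample = proportion_test_positive(SENSITIVITY, SPECIFICITY, 337 / 544)

print(f"PPV at 6% prevalence: {ppv_6pct:.1%}")
print(f"Screen-positive share in sample: {referred_in_sample:.0%}")
```

With these inputs the sketch gives a PPV of roughly 26% and a screen-positive share of roughly 57%, in line with the figures reported above. Changing the prevalence argument shows how strongly the PPV depends on the population being screened.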
The above pattern was broadly similar when analyses were repeated using the fixed ratio to define obstruction for confirmatory spirometry (FEV1/FVC < 0.7), though sensitivity was slightly lower at each cut-point and specificity remained at 100% until FEV1/FEV6 > 0.7 (Supplementary Table 1). These analyses may reflect the test performance when using the simpler criterion for the lung monitor in countries defining airflow obstruction as FEV1/FVC < 0.7, such as the UK. 28

DISCUSSION
We found that the lung monitor has high discriminatory accuracy among patients with existing chronic respiratory symptoms. This supports its suitability, either alone or perhaps in combination with a symptom questionnaire, as a screening test prior to confirmatory spirometry. We further demonstrated that using a bronchodilator with the lung monitor as part of screening offers no performance advantage.
Importantly, the lung monitor demonstrated good test performance despite being delivered with minimal coaching and only requiring a maximum of three blows, rather than the possible six blows to achieve repeatability with confirmatory spirometry.
Using pre-bronchodilator FEV1/FEV6 < LLN, the lung monitor missed half of COPD cases identified by FEV1/FVC < LLN from confirmatory spirometry but correctly classified virtually all participants without COPD. When using pre-bronchodilator FEV1/FEV6 < 0.70, the lung monitor detected a higher proportion of true positives and the same proportion of true negatives, and the discriminatory accuracy was essentially unchanged (C = 0.90 vs C = 0.91). Given the added complexity of applying LLN to the lung monitor as it is not connected to computer software, it appears justifiable to apply an FEV1/FEV6 fixed ratio to the lung monitor for purposes of screening, while maintaining the LLN for diagnosing and monitoring COPD. [29][30][31][32]
Test performance varied considerably depending on the specified cut-point of the pre-bronchodilator FEV1/FEV6 ratio. Our proposed optimal cut-point of <0.78 was similar to previous studies, which had suggested using cut-points of <0.75, 21 <0.78 19 and <0.80. 20 The sensitivity and specificity of the lung monitor in our sample were acceptable for a screening test, missing <20% of COPD cases, while 1 in 4 patients of the 57% referred for confirmatory spirometry were true positives and therefore would be eligible for diagnosis and relevant treatment. While FEV1/FEV6 < 0.78 appeared the most efficient in our sample, if the lung monitor were to be used as a screening test the cut-point could be modified according to the balance of acceptable false negative rates and availability of resources.
We have assessed the screening test performance of one type of microspirometer. One factor affecting accuracy may be the different lung function indices being measured: FEV6 by the lung monitor vs FVC by the ndd device (confirmatory spirometry). We assessed test performance of both devices using FEV1/FEV6 < LLN as the cut-off for a positive result, relative to confirmatory spirometry FEV1/FVC < LLN. The ndd device had sensitivity of 80.4% and specificity of 98.1%, compared with the lung monitor sensitivity of 50.5% and specificity of 99.0%. This suggests that the difference in indices only partly affects performance. Another important difference to consider is the type of sensor used in the two devices for flow/volume measurement (turbine in the lung monitor vs ultrasonic in the ndd), as evidence suggests a degree of inaccuracy in turbine devices. 33,34
Our analysis sample had fewer controls than determined by our sample size calculation, containing 207 instead of 248. While the precision around specificity was reduced, the precision around sensitivity estimates was unaffected; the latter being arguably more important in the context of screening.
Table 4. Screening accuracy of pre-bronchodilator lung monitor FEV1/FEV6 cut-points, against post-BD confirmatory spirometry (FEV1/FVC < LLN).
Using the LLN criteria to define cases in our primary analysis ensured an accurate assessment of lung function, without added 'noise' from misdiagnosed patients which can be introduced when using the FEV 1 /FVC < 0.7 ratio. 32 As the majority of previous microspirometry test accuracy studies used the fixed ratio definition of obstruction, [19][20][21]24,25,27,35 our study has made a valuable contribution to the evidence base.
Owing to the case-control study being nested within a larger COPD cohort study, the analysis sample had a higher prevalence of COPD and possibly more advanced disease than would be observed in an undiagnosed primary care population reporting respiratory symptoms. Therefore, our study is at potential risk of spectrum bias, as the reported sensitivities and specificities may not fully reflect the test performance of the lung monitor if used as a screening tool within symptomatic patients with lower prevalence of COPD. However, the reported post-test estimates were calculated using Bayes' Theorem on the basis of the current UK COPD prevalence of 3-10%, which mitigates this risk.
Nearly a third of our sample was a screened population, suggesting that our findings will resonate with potential screening processes, as patients could be selected for microspirometry on the basis of symptom- or risk-based screening tests. Furthermore, the fact that we included patients with chronic respiratory symptoms and a range of lung function severities means that our results may apply to an undiagnosed population with a similar symptom profile. In addition, our study was not restricted to ever-smokers, unlike previous studies. 14,19,21,22,25
For practical reasons, the same researcher administered both the lung monitor and confirmatory spirometry. Although researchers only recorded raw FEV1 and FEV6 lung monitor values and did not calculate obstruction from this first test, it is possible that researchers were not entirely blind when administering the confirmatory spirometry to the patient. While this introduced a risk of review bias, this was minimised as researchers received standardised training to give only brief instruction for lung monitor tests and proper coaching for confirmatory spirometry.
Most previous studies have used either only pre-bronchodilator microspirometry [19][20][21][23][24][25][26] or only post-bronchodilator microspirometry, 27 and the only study to measure pre- and post-bronchodilator microspirometry did not report comparative test accuracy. 22 By demonstrating the comparability of test performance irrespective of bronchodilation, our study supports the continued use of pre-bronchodilator microspirometry for screening purposes.
Participants performed three blows using the lung monitor, irrespective of blow quality as indicated by the device's in-built quality alert, with the highest recorded readings being used for analyses. While this follows some previous studies, 23,25,26 had we required all lung monitor blows to be technically valid 19,20,22 we may have obtained greater FEV1 or FEV6 values for some participants. Furthermore, like most studies we did not assess within-participant repeatability across blows on the lung monitor, though this has been done in at least one study. 21
The observed test performance of the lung monitor suggests that it could be reliably used as a screening tool in patients perceived to be at risk of COPD, to select those requiring confirmatory spirometry. The efficiency of the diagnostic spirometry test could therefore be substantially increased by screening out in advance patients highly unlikely to have airflow obstruction. Screening at-risk symptomatic patients with a lung monitor rather than referring all patients for confirmatory spirometry also represents financial savings, with the handheld device being approximately one-tenth of the cost of diagnostic spirometers. Resource savings could be realised in practices irrespective of whether they conduct confirmatory spirometry 'in house' or refer patients to a lung function unit, as both models would reduce the number of patients performing this diagnostic test.
The ability to use a fixed ratio for the lung monitor rather than the LLN to assess airflow obstruction represents a time saving for clinicians, who would otherwise need to use software to refer to reference equations. The comparable test performance of the lung monitor irrespective of bronchodilation supports the use of pre-bronchodilator tests, further contributing to the efficiency and ease of the screening test, a key consideration in the context of time-pressured primary care consultations.
The lung monitor could potentially be administered by any member of a primary care team, as it is a simple device requiring minimal training. This would be beneficial in general practice where staff may be unfamiliar with the device, 19 and the simplicity may minimise the risk of becoming de-skilled in using the lung monitor, in contrast to confirmatory spirometry where clinicians' skills can reduce over time if they do not perform the test regularly. 36 The simplicity of the lung monitor, the minimal number of required blows and its good test performance suggest that it could be particularly useful as a screening test in patients with poor coordination or lower cognitive ability. Furthermore, our Patient Advisory Group preferred the lung monitor over other microspirometer models, suggesting that it may be more acceptable to patients.
While we have suggested optimal cut-points based on the balance of sensitivity and specificity, in practice, the optimal cut-point would be determined by the clinical setting in which the lung monitor was being used. For example, in settings where access to quality confirmatory spirometry may not be available, particularly in low-resource settings, specificity of the lung monitor may be prioritised. In these settings, using thresholds with higher specificity could effectively exclude the majority of those with respiratory symptoms who do not have COPD, thus preventing overdiagnosis.
In addition to use as a screening tool, the accurate measurement of FEV1 may indicate that the device could be used to monitor obstruction severity or lung function decline among diagnosed COPD patients, for example during annual reviews. Further research would be needed to explore this, but the potential time and cost savings afforded by using the lung monitor instead of confirmatory spirometry may be attractive to General Practitioner practices, who would still obtain annual FEV1 values as recommended by bodies such as the National Institute for Health and Care Excellence in the UK. 28
Future research could build on preliminary evidence regarding microspirometer screening strategies, 13,14 which could be implemented in differing clinical or economic contexts. Using a combination of microspirometry and screening questionnaires, for example, may prove more efficient than microspirometry alone. Furthermore, rather than using one cut-point to identify patients requiring confirmatory spirometry, certain contexts may warrant using two cut-points to refer only those patients where there is uncertainty about their diagnosis. For example, in low- and middle-income countries where availability of confirmatory spirometry may be limited, a three-tiered approach may be plausible whereby the top proportion of patients are defined as test negative, the bottom proportion are defined as test positive and the middle proportion are referred for confirmatory spirometry.
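The three-tiered approach described above could be expressed as a simple two-threshold rule on the FEV1/FEV6 ratio. The Python sketch below is a hypothetical illustration: the function name and the two cut-points (0.70 and 0.80) are placeholders chosen for the example and were not evaluated in this study.

```python
def triage(fev1_fev6_ratio, lower_cut, upper_cut):
    """Classify a microspirometry result using two cut-points.

    Ratios below lower_cut are treated as test positive, ratios at or above
    upper_cut as test negative, and anything in between is referred.
    """
    if lower_cut >= upper_cut:
        raise ValueError("lower_cut must be smaller than upper_cut")
    if fev1_fev6_ratio < lower_cut:
        return "test positive"
    if fev1_fev6_ratio >= upper_cut:
        return "test negative"
    return "refer for confirmatory spirometry"


# Hypothetical cut-points, for illustration only (not derived from the study).
LOWER_CUT = 0.70
UPPER_CUT = 0.80

for ratio in (0.62, 0.75, 0.86):
    print(f"{ratio:.2f}: {triage(ratio, LOWER_CUT, UPPER_CUT)}")
```

In practice the two thresholds would need to be chosen and validated for the setting in question, balancing the acceptable false negative and false positive rates against the availability of confirmatory spirometry.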
Our results show that the Vitalograph lung monitor, which is a cheap and simple device, has acceptable accuracy for use within a screening pathway for undiagnosed COPD among primary care patients with respiratory symptoms. We have established that the test performance of the lung monitor is unaffected by bronchodilation, and our optimum cut-point of FEV1/FEV6 < 0.78 supports previous studies, with no observed advantage of using LLN for this screening test. Our paper makes a valuable contribution to the evidence base concerning potential COPD screening tests, though more work is required to inform the need for a formal screening programme.

METHODS
Study design
We conducted a prospective case-control study to evaluate the screening performance of the Vitalograph® lung monitor (Vitalograph Ltd, Buckingham, UK), nested within a large COPD Cohort study.

Participant recruitment
Study participants were drawn from those attending for their 3-year follow-up assessment as part of the Birmingham COPD Cohort Study, which has been reported in detail elsewhere. 37 In brief, participants were primary care patients aged ≥40 years, who either had previously clinically diagnosed COPD or had reported chronic respiratory symptoms as part of a case-finding trial. 38 Participants from the case-finding trial were invited to join the Cohort study, irrespective of their spirometry results, if they reported chronic cough or phlegm for ≥3 months for at least 2 years, wheeze in the past 12 months or dyspnoea of MRC grade ≥2.
At the 3-year follow-up assessment visit, cohort participants were invited to take part in the additional tests for this case-control study (Fig. 2) and those who agreed were asked to sign a consent form. Those who declined to participate completed the standard Cohort assessment. The National Research Ethics Service Committee West Midlands, Solihull provided approval for both the Birmingham COPD Cohort (11/WM/0304) and the case-finding trial (11/WM/0403).

Data collection and clinical measures
In addition to the lung function tests described below, participants underwent the standard Cohort follow-up assessment, which included various physiological and anthropometric measurements (height, weight, grip strength, exercise capacity) as well as completing questionnaires.
Index test: lung monitor microspirometry
Participants received pre- and post-bronchodilator microspirometry with the Vitalograph lung monitor prior to confirmatory post-bronchodilator spirometry (Fig. 1). The lung monitor measured FEV1 and FEV6 in litres. In contrast to confirmatory spirometry, participants received minimal explanation or coaching when using the lung monitor. Researchers told participants to take a deep breath until lungs were full and blow into the mouthpiece as hard and fast as they could until being told to stop. Researchers demonstrated the correct technique once and then allowed the participant to perform the blows themselves, without additional coaching or encouragement. Participants performed three blows pre-bronchodilator and three blows post-bronchodilator. Technically unsatisfactory blows identified by the in-built quality assessment were recorded on the case report form, but participants were not asked to repeat the blow. The best FEV1 and FEV6 blows were used for analyses, irrespective of quality and which blow attempt they came from.
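The selection of the best blows can be sketched as follows. This Python example assumes, as the description above suggests, that the highest FEV1 and the highest FEV6 were taken independently, so they may come from different blows, and that the ratio was then formed from those two values. The function name and the blow values are hypothetical.

```python
def best_blow_summary(blows):
    """Return (best FEV1, best FEV6, FEV1/FEV6 ratio) from a list of blows.

    Each blow is a (fev1_litres, fev6_litres) pair. The best FEV1 and best
    FEV6 are selected independently, regardless of which blow they came from.
    """
    if not blows:
        raise ValueError("at least one blow is required")
    best_fev1 = max(fev1 for fev1, _ in blows)
    best_fev6 = max(fev6 for _, fev6 in blows)
    return best_fev1, best_fev6, best_fev1 / best_fev6


# Hypothetical readings from three pre-bronchodilator blows (litres).
example_blows = [(1.82, 2.60), (1.90, 2.55), (1.85, 2.71)]

fev1, fev6, ratio = best_blow_summary(example_blows)
print(f"Best FEV1 = {fev1:.2f} L, best FEV6 = {fev6:.2f} L, ratio = {ratio:.2f}")
```

Here the best FEV1 comes from the second blow and the best FEV6 from the third, which is the situation the phrase "irrespective of ... which blow attempt they came from" appears to allow.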
Positive test results were defined as being below the 5th percentile of the predicted pre-bronchodilator FEV1/FEV6 ratio (i.e. the LLN) using the NHANES III equations. 39 Alternative positive test results were also prespecified, including post-bronchodilator FEV1/FEV6 below the LLN, and various cut-points of the FEV1/FEV6 ratio.
Reference test: post-bronchodilator confirmatory spirometry
Post-bronchodilator confirmatory spirometry was conducted according to American Thoracic Society and European Respiratory Society 2005 guidelines 40 by trained researchers using the ndd Easy on-PC spirometer. Participants received 400 μg of salbutamol and, after waiting at least 20 min, performed a minimum of 3 (maximum of 6) blows until repeatability was achieved. Although the lung monitor and spirometry tests were administered by the same researcher, the tests were in effect administered blind of each other, as researchers did not record the FEV1/FEV6 ratio for the lung monitor before administering confirmatory spirometry.
Cases were defined as participants whose FEV1/FVC ratio on confirmatory spirometry was below the LLN, derived using the NHANES III equations. Participants not meeting this criterion formed the controls.

Aims
The primary aim was to assess the pre-bronchodilator test accuracy (sensitivity and specificity) of the lung monitor (FEV1/FEV6) against post-bronchodilator confirmatory spirometry (FEV1/FVC), using the LLN definition of airflow obstruction.
Fig. 2 Case-control study design.
We also aimed to assess the correlation and agreement between lung function measures from both devices and to compare test accuracy of pre- and post-bronchodilator lung monitor data. Finally, to identify the threshold that optimised sensitivity and specificity, we explored the effect of using different FEV1/FEV6 thresholds, including the fixed ratio of <0.7, to define a positive test result on the lung monitor.
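One common way to summarise the balance of sensitivity and specificity across candidate thresholds is Youden's J statistic (sensitivity + specificity - 1). The paper does not state which criterion was used to identify the optimal threshold, so the Python sketch below is an assumption for illustration. It uses the three pre-bronchodilator sensitivity and specificity pairs reported in this paper.

```python
def youden_j(sensitivity, specificity):
    """Youden's J statistic: 0 for an uninformative test, 1 for a perfect one."""
    return sensitivity + specificity - 1.0


# Pre-bronchodilator lung monitor accuracy reported in this paper,
# relative to confirmatory spirometry FEV1/FVC < LLN: (sensitivity, specificity).
reported = {
    "FEV1/FEV6 < LLN": (0.505, 0.990),
    "FEV1/FEV6 < 0.70": (0.588, 0.986),
    "FEV1/FEV6 < 0.78": (0.828, 0.850),
}

# Rank the criteria from highest to lowest J.
ranked = sorted(reported, key=lambda name: youden_j(*reported[name]), reverse=True)

for name in ranked:
    sens, spec = reported[name]
    print(f"{name}: J = {youden_j(sens, spec):.3f}")
```

Under this criterion the <0.78 cut-point ranks highest of the three. A setting that weights false positives more heavily than false negatives would need a different criterion.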

Sample size
We calculated that we required a sample size of 248 cases and 248 controls to detect an assumed sensitivity of 85% 21,27 while ensuring the lower bound of the CI was >80%.

Statistical analysis
We evaluated the diagnostic test accuracy of the lung monitor (index test) for all participants with complete data for the index and reference tests. We estimated sensitivity and specificity of the lung monitor using pre-bronchodilator data. We compared test accuracy of pre- and post-bronchodilator lung monitor blows, using McNemar's test. Using continuous test values, we assessed the discriminatory accuracy of FEV1 and FEV6 measured by the lung monitor via receiver operating characteristic curve analysis. We then conducted sensitivity analyses using a fixed ratio definition of obstruction to identify a lung monitor FEV1/FEV6 optimal threshold. To account for the case-control study design, post-test probabilities (herein referred to as positive predictive values (PPV)) were calculated using Bayes' Theorem to reflect current COPD prevalence in the UK. For our tables and appendices, we calculated PPVs based on COPD prevalence among adults aged ≥40 years being 3-10%. 1,41 All analyses were conducted in Stata SE v15. The paper was written according to the STARD guidance 42 for reporting studies of diagnostic accuracy.
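The paired comparison of pre- and post-bronchodilator classifications can be illustrated with an exact McNemar test, which depends only on the discordant pairs (participants classified differently by the two tests). The paper does not state whether the exact or the chi-squared version was used, and the analysis was run in Stata, so the Python sketch below is an assumption for illustration. The discordant counts in the example are hypothetical, not study data.

```python
from math import comb


def mcnemar_exact_p(b, c):
    """Two-sided exact McNemar p-value.

    b: pairs positive on the first test only.
    c: pairs positive on the second test only.
    Under the null hypothesis the smaller count follows Binomial(b + c, 0.5).
    """
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs, so no evidence of a difference
    k = min(b, c)
    lower_tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2.0 * lower_tail)


# Hypothetical discordant counts, for illustration only.
p_value = mcnemar_exact_p(b=9, c=1)
print(f"Exact McNemar p-value: {p_value:.4f}")
```

A large p-value would be consistent with the two tests classifying participants similarly. It does not by itself demonstrate equivalence.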

DATA AVAILABILITY
The data that support the findings of this study are available on request from the corresponding authors (A.P.D. or P.A.).