Automated chest-radiography as a triage for Xpert testing in resource-constrained settings: a prospective study of diagnostic accuracy and costs

Molecular tests hold great potential for tuberculosis (TB) diagnosis, but are costly, time consuming, and HIV-infected patients are often sputum scarce. Therefore, alternative approaches are needed. We evaluated automated digital chest radiography (ACR) as a rapid and cheap pre-screen test prior to Xpert MTB/RIF (Xpert). 388 suspected TB subjects underwent chest radiography, Xpert and sputum culture testing. Radiographs were analysed by computer software (CAD4TB) and specialist readers, and abnormality scores were allocated. A triage algorithm was simulated in which subjects with a score above a threshold underwent Xpert. We computed sensitivity, specificity, cost per screened subject (CSS), cost per notified TB case (CNTBC) and throughput for different diagnostic thresholds. 18.3% of subjects had culture positive TB. For Xpert alone, sensitivity was 78.9%, specificity 98.1%, CSS $13.09 and CNTBC $90.70. In a pre-screening setting where 40% of subjects would undergo Xpert, CSS decreased to $6.72 and CNTBC to $54.34, with eight TB cases missed and throughput increased from 45 to 113 patients/day. Specialists, on average, read 57% of radiographs as abnormal, reducing CSS ($8.95) and CNTBC ($64.84). ACR pre-screening could substantially reduce costs, and increase daily throughput with few TB cases missed. These data inform public health policy in resource-constrained settings.

this is problematic as many HIV-infected patients are unable to produce sputum [14][15][16] . Obtaining sputum from high numbers of subjects is time consuming and logistically challenging, and the majority of TB prevalence surveys to date pre-screen participants with symptoms and chest radiography (WHO Strategy 3) 13,17 . Thirdly, the Xpert test involves two hours of processing time, restricting daily throughput. Thus, high cost and long processing intervals limit the value and widespread uptake of Xpert as a point-of-care test. Other widely used tools for the diagnosis of active TB, which are older but reliant, are sputum smear microscopy and culture. The former is relatively cheap ($2.10 -$4.60) 6,9,11 but has limited performance 18 . The latter is rather expensive ($5 -$15) 6,9 and has a processing time of multiple weeks. Thus, both tests are suboptimal for the detection of TB in point-of-care settings. Alternative algorithms or tests are urgently needed.
Pre-screening tests have the potential to improve throughput if they are quicker to perform than Xpert, and should also reduce cost 19 . In this study, we investigated a novel pre-screening test, namely automated chest radiography (ACR), which consists of digital chest radiography assisted by automated interpretation by computer software. This test is rapid: it requires only correct positioning of the patient to acquire a radiographic image. The image is subsequently processed within 1 minute by a standard computer software program. The advantages of automatically analysing radiographs with computer software are the need of only a radiographer, and no on-site clinical officer. The latter might not always be available and is expensive. ACR is therefore not labour intensive and is faster and potentially cheaper than conventional CXR reading. A recent study by Maduskar et al. 20 showed that sensitivity and specificity of computerized reading was not significantly different from reading by clinical officers.
In this study, we evaluate ACR as a pre-screening tool prior to molecular (Xpert) testing. We propose a diagnostic algorithm that is tested on data from a passive case-finding study, but could also be employed in other settings. The performance of this diagnostic algorithm is reported in terms of sensitivity, specificity, throughput and costs, and is compared with a scenario where Xpert is used for all patients and an alternate scenario in which the radiographs are read by humans.

Methods
Data. For this study, data from the TB-NEAT collaborative study was used 21 . All patients enrolled in this study were self-referred TB suspects presenting at a busy urban clinic in Cape Town, South Africa. All subjects provided sputum, underwent digital chest radiography on an Odelca DR unit (Delft Imaging Systems, Veenendaal, The Netherlands) and were offered voluntary testing and counselling for HIV. In the parent study, conventional radiography was used and liquid culture, and Xpert testing was performed by a trained laboratory technician in a centralized reference laboratory. Full study details have been reported elsewhere 21 . The study was carried out according to the protocol approved by the University of Cape Town Faculty of Health Sciences Research Ethics Committee (#404/2010). All patients provided written informed consent for study participation. Of the 419 recruited patients, 6 patients had missing radiographs, 22 patients had no conclusive Xpert or culture results and the software was unable to ascertain a diagnosis for 3 cases. Thus, a total of 388 patients had all information available. Culture positivity for Mycobacterium tuberculosis served as the reference standard. All cases were scored, blinded to any clinical information, on an integer scale from 0 to 100 by two CRRS certified "B" readers 22,23 and an experienced radiologist (reader 3) for the presence of abnormalities consistent with active TB, with a score >50 considered suspicious for active TB.
Diagnostic algorithm. All chest X-rays (CXRs) were retrospectively processed by the CAD4TB v3.07 (Diagnostic Image Analysis Group, Radboud university medical center, Nijmegen, The Netherlands) computer software. This software produces a continuous image abnormality score between 0 (normal) and 100 (highly abnormal). An example can be found in Fig. 1. Using various thresholds (T) on this score, we simulated the effect of using the software as a pre-screening test for molecular testing. In this simulated diagnostic algorithm, subjects with a score smaller than or equal to T were regarded as TB-negative, and these subjects would not undergo additional testing; otherwise, Xpert test results were used. The algorithm is schematically depicted in Fig. 2.
Hypothetical point-of-care testing unit. For this study, we assumed a hypothetical setting where a mobile TB unit is used for passive case finding and point-of-care testing. This hypothetical unit would have one digital radiography system and three 4-cartridge GeneXpert IV machines, resulting in a capacity of 300 ACRs per day, and an Xpert testing capacity of 45 tests per day. Subjects selected for Xpert testing by the ACR pre-screening would be referred directly for Xpert testing in the same unit.
Cost analysis. Cost analysis was limited to the point-of-care-associated costs of diagnosis. We assumed an Xpert price of $13.06, based on the FIND subsidized price for Xpert cartridges 24 . The price of ACR is determined largely by the cost of the radiography unit and the throughput. We assume a cost of $1.46 for ACR 9,25 . Furthermore, no increased costs were assumed for manual CXR reading. Detailed cost calculations for Xpert and ACR can be found in Table S1 and S2 of the supplementary material. Performance analysis. Diagnostic performance was assessed by computing sensitivity, specificity, positive predictive value (PPV) and negative predictive value (NPV) at each threshold. With the assumed costs and capacity, the cost per screened subject (CSS), the cost per notified TB case (CNTBC) as well as the throughput in cases/day was calculated. The performance of ACR was compared to CXR pre-screening with specialist readers. In this scenario, the ACR score was replaced by the composite score of specialist readings, and different thresholds (R) were simulated and performance was computed. We also evaluated scenarios where all HIV-infected patients would undergo Xpert testing, without receiving pre-screening. Sensitivity changes were tested for significance using the McNemar χ 2 test (IBM SPSS Statistics 20), considering p < 0.05 significant.

Results
The dataset contained 388 cases, of which 128 (33.0%) subjects were HIV-positive, 71 (18.3%) subjects were culture positive, 62 (16.0%) subjects were Xpert positive and 56 (14.4%) subjects were both culture and Xpert positive. These results are listed in Table 1. The performance, in terms of area under ROC curve, of the CAD4TB software and standalone specialist readers is comparable and not significantly different, and is shown in the supplementary Figure S1. Table 2 shows the performance of the ACR system at different thresholds. Thresholds were chosen based on the percentage selected for downstream Xpert testing shown in the second column. The first row matches the scenario where ACR is omitted and every subject receives an Xpert test. The Xpert baseline performance in terms of sensitivity and specificity was 78.9% and 98.1%, respectively, resulting in a CSS and CNTBC of $13.09, and $90.70, respectively. Significant changes in sensitivity are marked with an asterisk (*). The performance of the diagnostic algorithm combined with human reading is shown in Table 3.
Sensitivity and specificity. The maximum achievable sensitivity with this algorithm is that of the Xpert test. With increasing ACR score thresholds, more subjects (including TB-positives) were excluded for Xpert testing, which by definition decreases sensitivity and increases specificity. The changes in sensitivity and specificity were moderate for scenarios up to T = 85, where 40% of subjects undergo Xpert   Throughput. Given the Xpert and ACR capacities, the daily throughput was calculated for each threshold T. The throughput started at 45 for the Xpert-only scenario, and increased to 150 for pre-screening with T = 94. The maximum capacity of the digital radiography unit was not reached due to the limited Xpert capacity.

Performance analysis stratified by HIV status. The analysis stratified by HIV status, is shown
in Tables 4 and 5. Comparing the tables, it can be seen that the performance of Xpert was significantly better in HIV-uninfected patients than in HIV-infected patients. The HIV-uninfected group showed a sensitivity of 93.9% and specificity of 99.1% for Xpert alone, and changed to 87.9% and 100%, respectively, with higher thresholds (T = 85). The CSS and CNTBC decreased 49% and 45%, respectively. The sensitivity and specificity of Xpert for HIV-infected subjects was 65.8% and 95.6%, respectively. For higher thresholds (T = 85), sensitivity dropped to 50.0% and specificity increased to 98.9% (see Table 5). The CSS and CNTBC decreased by 49% and 33%, respectively. Table 6 shows the results for a scenario where all HIV-infected and HIV-unknown subjects get Xpert testing and HIV-uninfected patients get ACR pre-screening. For thresholds up to 50% Xpert (T = 95), the performance is better compared to a scenario where all patients get pre-screening. For example, at T = 84 (60% Xpert), the sensitivity decreased non-significantly to 76.1% and specificity increased to 98.7%, while the CSS decreased to $8.81 (33% decrease), CNTBC to $63.30 (30% decrease) and throughput increased to 75. Specialist reading. The results of specialist reading are shown in Table 3. Based on R = 50, the specialist reader scores associated with suspicion for active TB, three readers committed 63%, 56% and 53%, respectively, of the cases for Xpert testing. The sensitivity for reader 1 was slightly higher than for reader 2 and 3: 76.1% versus 74.6%, but came with a slightly decreased specificity: 98.4% versus 99.1%. The PPV for readers 2 and 3 is better than for reader 1: 94.6% versus 91.5%. The NPV was similar for all readers, with 94.8% and 94.6%. As reader 2 and 3 sent fewer patients for Xpert testing, the CSS are slightly lower, $8.75 and $8.43 versus $9.66, respectively, as are the CNTBC: $64.04 and $61.07 versus $69.40, respectively. Comparing these numbers with the ACR threshold T = 60 (see Table 2), which also processed roughly 60% of the patients for Xpert, the specialist readers perform marginally better. As the   radiographs were scored on a scale from 0 to 100, higher thresholds could also be used. For the scenario where roughly 30% of the subjects receive Xpert testing, the automated algorithm outperforms reader 1, shows similar results for reader 3; however it is inferior to reader 2 (see Table 2).

Discussion
Following WHO recommendations for molecular based testing in 2010, the procurement of the GeneXpert MTB/RIF systems in resource-constrained countries is high and is still expanding. However, available funds are often limited, and the WHO has also highlighted the need for cost-saving diagnostic pathways. Our results have shown that ACR pre-screening may offer a solution: being cheaper, faster, with only a moderate decrease in sensitivity, and the benefit of increased throughput, as compared with Xpert testing. Using the ACR pre-screening algorithm at T = 85, only 40% of the patients would be sent for a downstream Xpert test, and the CSS and CNTBC were less than 51% and 60% of the original cost, respectively. In this study, the benefit of ACR came at the cost of eight additionally missed TB cases. This can be attributed to the limitations of radiography screening itself in addition to the use of automated pre-screening. Of these eight cases, six were HIV-infected, and it has been previously reported that chest radiography is less sensitive in immunocompromised patients 26,27 . This effect was also seen in the results by HIV stratification in Tables 4 and 5. The performance of the diagnostic algorithm among HIV-uninfected patients is considerably better than in HIV-infected patients, and consideration should    be applied for different thresholds for both groups. For example pre-screening might be considered only for HIV-uninfected patients ( Table 6). This would still reduce CSS and CNTBC with more than 30% (T = 84), with only a marginally non-significant reduction in sensitivity (two additionally missed TB cases). Comparing the automated system to specialist readers, the performance of the algorithm is slightly inferior with medium-range thresholds, but comparable for high thresholds, although the performance among readers differed. However, although not modelled in this study, specialist reading is more time consuming, requires training and also increases costs. The proposed pre-screening diagnostic algorithm is fully automated and requires only an on-site radiographer. The availability of automated analysis software obviates the need for a radiologist or clinical officer and limits the throughput to that of the radiography unit only. In our simulations, we kept the cost of ACR and manual reading constant at $1.46, and this amount did not change with altered throughput nor did it affect the main outcomes of the study.
Besides cost-reduction, pre-screening can substantially increase throughput, although it would remain limited by the Xpert machine's capacity. This may be particularly useful in screening settings, as sputum collection would not be needed for many potential TB subjects. Additionally, given that the average test duration is reduced with most patients receiving a negative radiology test result within a few minutes, this would obviate a 2-hour wait for Xpert results.
We predict that, in a typical screening setting, pre-screening with ACR could increase patient throughput by up to 250%. Another advantage of automated reading is that different thresholds with objective and reproducible results could be chosen. Thus, according to the setting used, a threshold could be chosen to obtain the desired sensitivity, specificity or throughput. This makes ACR highly valuable in resource-constrained settings, and this proposed diagnostic pathway can save both cost and time in a point-of-care setting. We hypothesize that the algorithm's value is even higher in active case-finding scenarios and sputum-scarce cohorts, but future studies are needed.
Compared to a recent study by Muyoyeta et al. 28 , which uses a previous version of the CAD4TB software on presumptive TB patients with Xpert as reference, the current software showed improved performance and supported their findings that ACR has a potential role in TB screening. Additionally, the current software showed better performance values than reported in a previous study by Maduskar et al. 20 .
The proposed diagnostic algorithm does not account for patients with high ACR scores but negative Xpert results: it is likely that these patients either have a false negative Xpert test or may have disease other than TB. Furthermore, current costs analysis was limited to the costs of diagnosis only and does not take into account the cost of treatment and misdiagnosed patients. This group of patients should be flagged to receive further attention based on their ACR scores. In general, basic automated screening for abnormalities could prompt referral mechanisms for higher levels of care.
In conclusion, ACR pre-screening before sputum Xpert molecular testing is a promising new advance in TB diagnostics. The ACR algorithm can significantly reduce cost and substantially increase throughput, while maintaining high sensitivity. This makes it a potentially highly valuable diagnostic tool in resource-constrained countries.