Predictors for bronchoalveolar lavage recovery failure in diffuse parenchymal lung disease

Bronchoalveolar lavage (BAL) plays a role in the diagnosis of diffuse parenchymal lung diseases (DPLD); however, poor BAL fluid (BALF) recovery results in low diagnostic reliability. BAL is relatively safe, but its indications should be carefully considered in patients at risk. Therefore, estimating the likelihood of recovery failure is helpful in clinical practice. This study aimed to clarify predictors of BALF recovery failure and to develop simple-to-use models for predicting it. We identified the predictors by applying a logistic regression model to clinical, physiological, and radiological data from 401 patients with DPLD (derivation cohort). The discrimination performance of the prediction models using these factors was evaluated by the c-index. In the derivation cohort, being a man, a low forced expiratory volume in one second/forced vital capacity ratio, and a BAL target site other than the right middle lobe or left lingula were independent predictors. The c-indices of models 1 and 2 that we developed were 0.707 and 0.689, respectively. In a separate cohort of 234 patients (validation cohort), the c-indices of the models were 0.689 and 0.670, respectively. In conclusion, based on independent predictors of BALF recovery failure, we developed and successfully validated simple-to-use prediction models that are useful for pulmonologists considering BAL indications or target sites.


Methods
Subjects. We retrospectively reviewed the records of 605 consecutive patients with DPLD who had undergone elective BAL between October 2013 and September 2018 at the Hamamatsu University Hospital. These patients had been diagnosed as having DPLD on the basis of the results of diagnostic procedures (including BAL) in accordance with guidelines or statements [3-5, 12-20]. Figure 1 presents the study flow chart. We excluded data from 75 patients whose pulmonary function test results were unavailable within 6 weeks prior to BAL. Subsequently, we excluded data from 129 patients whose HRCT data were lacking within 6 weeks prior to BAL. Consequently, we analyzed data from 401 patients (derivation cohort) to identify predictors of recovery failure and developed BALF recovery failure prediction score models based on these predictors. Furthermore, to validate the performance of the models, we extracted the data of 234 consecutive patients with DPLD who had undergone elective BAL between October 2018 and September 2020 at the same hospital (validation cohort). For all patients in this validation cohort, pulmonary function test results and HRCT data obtained within 6 weeks before BAL were available. We conducted the study in accordance with the tenets of the Declaration of Helsinki. The Institutional Review Board of the Hamamatsu University School of Medicine approved this study (approval number ) and waived the need for patient approval or informed consent because of the retrospective nature of the study.

Data collection.
We collected data on the following variables: clinical data, including age, gender, and smoking history (including pack-years); physiological data, including forced vital capacity (FVC), percent predicted FVC (%FVC), forced expiratory volume in one second (FEV1.0), and percent predicted FEV1.0 (%FEV1.0); HRCT data; BAL target site and recovery rate; and diagnosis after BAL (disease category or disease).
BAL procedure. Well-trained pulmonologists with 8 years or more of experience performed bronchoscopies with BAL on the basis of the official guidelines 1,7. Briefly, before the bronchoscopy, the patients inhaled a lidocaine solution through a nebulizer and received intravenous premedication with midazolam and pentazocine. The pulmonologists transorally inserted a fiberoptic bronchoscope with a distal-end outer diameter of 5.9 mm and a channel inner diameter of 3.0 mm (BF-1TQ290, Olympus, Japan). During the examination, lidocaine solution was instilled through the instrumentation channel of the bronchoscope for additional local anesthesia. To perform BAL, the pulmonologists placed the tip of the bronchoscope into a wedge position within the selected bronchopulmonary segment (BAL target site), chosen on the basis of an HRCT taken within 6 weeks prior to bronchoscopy. The position of each patient was determined according to the BAL target site: supine position for right middle lobe (RM) or left lingula (LL) targets, left lateral decubitus position for right superior lobe (RS) or right inferior lobe (RI) targets, and right lateral decubitus position for left superior lobe other than LL (LS) or left inferior lobe (LI) targets. A total volume of 150 mL of normal saline (3 aliquots × 50 mL each) was instilled. BALF recovery rates were calculated as the percentage (%) of the total instilled volume that was retrieved as BALF. BALF recovery failure was defined as a total volume of retrieved BALF lower than 30% of the total instilled volume or aborted BAL due to recovery of less than 5% of each instilled aliquot volume.
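The recovery rate and failure definitions above can be written as a short calculation. The following is a minimal sketch, not the authors' code: the function names and the `aborted` flag are hypothetical, and the sketch assumes the operator records whether the procedure was aborted rather than inferring it from per-aliquot volumes.

```python
from typing import Sequence

TOTAL_INSTILLED_ML = 150.0    # protocol in the text: 3 aliquots x 50 mL
FAILURE_THRESHOLD_PCT = 30.0  # failure: total recovery lower than 30%


def recovery_rate(retrieved_ml: Sequence[float],
                  instilled_ml: float = TOTAL_INSTILLED_ML) -> float:
    """Percentage of the total instilled volume that was retrieved."""
    if instilled_ml <= 0:
        raise ValueError("instilled volume must be positive")
    return 100.0 * sum(retrieved_ml) / instilled_ml


def is_recovery_failure(retrieved_ml: Sequence[float],
                        instilled_ml: float = TOTAL_INSTILLED_ML,
                        aborted: bool = False) -> bool:
    """True if BAL was aborted (hypothetical flag) or total recovery is below 30%."""
    return aborted or recovery_rate(retrieved_ml, instilled_ml) < FAILURE_THRESHOLD_PCT


if __name__ == "__main__":
    volumes = [10.0, 15.0, 15.0]  # hypothetical retrieved volume per aliquot, in mL
    print(f"{recovery_rate(volumes):.1f}% recovered; failure: {is_recovery_failure(volumes)}")
```

A recovery of exactly 30% is not counted as failure here, because the definition in the text uses "lower than 30%".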
HRCT analysis. Chest HRCT images were acquired by multi-detector-row CT (MDCT) using a 64-slice MDCT machine (Aquilon-64; Toshiba Medical Systems, Tokyo, Japan) with the patient in the supine position at full-inspiration breath hold. Using image analysis software (SYNAPSE VINCENT; Fuji Film, Tokyo, Japan), we obtained lung volumes and percentages of low attenuation areas (%LAA) in the lung from three-dimensional CT images that were reconstructed from the MDCT data. We defined %LAA, a surrogate measurement of emphysema, as the percentage of area below −950 HU in the total lung area. We used the lung volume and %LAA on the side of the BAL target site for our analyses.
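As an illustration of the %LAA definition, the sketch below computes the percentage of lung voxels below −950 HU. It assumes the lung has already been segmented and that the attenuation values are passed as a flat sequence; the function name is hypothetical, and this is not how the commercial software used in the study is implemented.

```python
from typing import Iterable

LAA_THRESHOLD_HU = -950.0  # threshold stated in the text


def percent_laa(lung_hu: Iterable[float],
                threshold_hu: float = LAA_THRESHOLD_HU) -> float:
    """Percentage of segmented lung voxels with attenuation below the threshold."""
    voxels = list(lung_hu)
    if not voxels:
        raise ValueError("no lung voxels supplied")
    low = sum(1 for hu in voxels if hu < threshold_hu)  # strictly below -950 HU
    return 100.0 * low / len(voxels)


if __name__ == "__main__":
    sample = [-980.0, -960.0, -900.0, -850.0]  # hypothetical voxel values in HU
    print(f"%LAA = {percent_laa(sample):.1f}")
```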

Statistical analysis.
We expressed all values as medians [interquartile ranges (IQRs)] or numbers (%). We applied Fisher's exact or Chi-squared tests to compare proportions between groups, and the Mann-Whitney U-test to compare medians. We evaluated correlations between parameters using Spearman's correlation test. We applied logistic regression analysis to identify variables associated with BAL recovery failure and calculated odds ratios (ORs), 95% confidence intervals (CIs), and P-values. We entered all variables identified as significant in the univariate analysis into our multivariate analysis. We performed a receiver-operating characteristic (ROC) curve analysis to identify an optimal cut-off value, chosen as the point with the highest value of sensitivity + specificity − 1. The c-index was calculated as the area under the ROC curve (AUC). Using the independent predictors identified, we generated simple point-score models for recovery failure prediction. The discrimination performance of the models was evaluated using the c-index. We considered P-values < 0.05 as indicating statistical significance. In multiple pairwise group comparisons, we applied the Bonferroni correction to adjust the P-value. We analyzed all data using EZR (Saitama Medical Center, Jichi Medical University, Saitama, Japan), a graphical user interface for R (The R Foundation for Statistical Computing, Vienna, Austria).

Results
BALF recovery and target sites. Figure 2a-d shows BALF recovery rates and recovery failure frequencies for each target site. We aggregated data from patients who underwent BAL from the RM or LL into an RM/LL group, and data from those who underwent BAL from any other site (RS, RI, LS, or LI) into an others group. The median recovery rate in the RM/LL group was significantly higher than that in the others group (49.3% vs. 43.3%, respectively, P = 0.04) (Fig. 2b). The recovery failure frequency in the RM/LL group was significantly lower than that in the others group (15.2% vs. 32.5%, respectively, P = 0.01) (Fig. 2d).

Study cohort characteristics.
Clinical, physiological, and radiological parameters and BALF recovery rates. Table 2 presents the correlations of clinical, physiological, and radiological parameters with recovery rates. BALF recovery rates demonstrated a very weak or weak negative correlation with age, smoking (pack-years), and BAL side lung volume, and a weak positive correlation with FEV1.0/FVC. Table 3 presents the results of the logistic regression analysis for recovery failure. On univariate analyses, being a man (vs. a woman), high smoking pack-years, high FVC, low FEV1.0/FVC, a BAL target site other than RM/LL (vs. RM/LL), and high BAL target site lung volume were associated with recovery failure. On the multivariate analysis, being a man (vs. a woman; OR 5.27, P < 0.01), having a low FEV1.0/FVC (OR 0.96 per 1% increase, P = 0.03), and a BAL target site other than RM/LL (vs. an RM/LL site; OR 2.78, P = 0.01) were independent predictors of recovery failure. Table 4 presents the results of the logistic regression analysis for recovery failure in the RM/LL group. On univariate analyses, gender (men), high smoking pack-years, low FEV1.0/FVC, and high BAL target site lung volume were associated with recovery failure. On the multivariate analysis, gender (men vs. women; OR 3.87, P < 0.01) and low FEV1.0/FVC (OR 0.97 per 1% increase, P = 0.04) were independent predictors of recovery failure. Supplementary Table S1 presents the comparison of diagnoses after BAL between men and women in the derivation cohort. The proportion of patients with idiopathic pulmonary fibrosis (IPF) was significantly higher in men than in women. Meanwhile, the proportions of those with connective tissue disease-associated ILD (CTD-ILD) and those with sarcoidosis were significantly lower in men than in women. Supplementary Table S2 presents the results of the logistic regression analysis for recovery failure with adjustment for diagnosis after BAL. These disease-adjusted multivariate analyses also identified being a man (vs. a woman), having a low FEV1.0/FVC, and a BAL target site other than RM/LL as independent predictors of recovery failure, irrespective of background disease or disease category.
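An odds ratio reported "per 1% increase" can be hard to read at a clinical scale. The sketch below rescales such an OR to a larger difference, under the assumption (inherent in a logistic model with a linear term) that the log-odds change linearly with FEV1.0/FVC. The function name is hypothetical; the input 0.96 is the multivariate OR reported above.

```python
import math


def scale_odds_ratio(or_per_unit: float, units: float) -> float:
    """Odds ratio for a change of `units`, given the OR for a 1-unit increase."""
    if or_per_unit <= 0:
        raise ValueError("an odds ratio must be positive")
    return math.exp(math.log(or_per_unit) * units)


if __name__ == "__main__":
    # OR for a 10-percentage-point lower FEV1.0/FVC, from OR 0.96 per 1% increase.
    print(f"{scale_odds_ratio(0.96, -10):.2f}")
```

Under this assumption, a 10-point lower FEV1.0/FVC corresponds to roughly 1.5-fold higher odds of recovery failure; the confidence interval would need to be rescaled in the same way.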

Predictors for BALF recovery failure.
We identified the optimal cut-off value of FEV1.0/FVC for predicting recovery failure in the derivation cohort using a ROC curve analysis (Supplementary Fig. S1). The c-index was 0.60 (95% CI 0.520-0.676). Using 74.4% as the cut-off value of FEV1.0/FVC, the sensitivity and specificity were 80.0% and 36.9%, respectively.
Development of the BALF recovery failure prediction score model. Using the independent predictors identified (gender, FEV1.0/FVC, and BAL target site), we developed a simple point-score model for recovery failure prediction. We set the FEV1.0/FVC cut-off value at 74% based on the result of the ROC analysis described above. In the derivation cohort, the recovery failure frequency in patients with an FEV1.0/FVC < 74% was significantly higher than that in those with an FEV1.0/FVC ≥ 74% (27.8% vs. 13.5%; P < 0.01; Fig. 3a). The recovery failure frequency in men was significantly higher than that in women (23.6% vs. 4.9%; P < 0.0001; Fig. 3b). The recovery failure frequency in patients who underwent BAL in a target site other than RM/LL was significantly higher than that in those who had an RM/LL target site (32.5% vs. 15.2%; P < 0.01). We assigned 1 point to each predictor and categorized patients of the derivation cohort into four groups based on their total point scores (0-3) (model 1). Figure 3c presents the performance of model 1. The recovery failure frequencies for the prediction score groups (0, 1, 2, and 3) were 3.6%, 16.2%, 30.9%, and 80.0%, respectively (P < 0.0001). The c-index of this model was 0.707 (95% CI 0.648-0.766) (Supplementary Fig. S2a). In a similar manner, we assigned 1 point each for having an FEV1.0/FVC < 74% and for being a man, and we categorized the patients of the derivation cohort into three groups based on their total point scores (0-2) (model 2, Fig. 3d). The recovery failure frequencies in the prediction score groups (0, 1, and 2) were 4.2%, 18.2%, and 34.3%, respectively (P < 0.0001). The c-index of this model was 0.689 (95% CI 0.631-0.746) (Supplementary Fig. S2b).
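The two point-score models are simple enough to express directly. The sketch below is an illustration under stated assumptions: the `Patient` class, field names, and site codes are hypothetical, and the `c_index` helper is a plain pairwise concordance calculation (equivalent to the AUC for a binary outcome), not the EZR routine used in the study.

```python
from dataclasses import dataclass
from typing import Sequence

FEV1_FVC_CUTOFF_PCT = 74.0  # cut-off used in the text
RM_LL_SITES = {"RM", "LL"}  # right middle lobe, left lingula


@dataclass
class Patient:
    is_man: bool
    fev1_fvc_pct: float
    target_site: str  # one of "RM", "LL", "RS", "RI", "LS", "LI"


def model2_score(p: Patient) -> int:
    """Model 2 (0-2): one point each for being a man and FEV1.0/FVC < 74%."""
    return int(p.is_man) + int(p.fev1_fvc_pct < FEV1_FVC_CUTOFF_PCT)


def model1_score(p: Patient) -> int:
    """Model 1 (0-3): model 2 plus one point for a target site other than RM/LL."""
    return model2_score(p) + int(p.target_site not in RM_LL_SITES)


def c_index(scores: Sequence[float], failures: Sequence[bool]) -> float:
    """Fraction of (failure, non-failure) pairs ranked correctly; ties count 0.5."""
    pos = [s for s, y in zip(scores, failures) if y]
    neg = [s for s, y in zip(scores, failures) if not y]
    if not pos or not neg:
        raise ValueError("both outcome classes are required")
    concordant = sum(1.0 if p > n else 0.5 if p == n else 0.0
                     for p in pos for n in neg)
    return concordant / (len(pos) * len(neg))


if __name__ == "__main__":
    patient = Patient(is_man=True, fev1_fvc_pct=70.0, target_site="RI")  # hypothetical
    print(model1_score(patient), model2_score(patient))
```

In the derivation cohort, a model 1 score of 3, as in this hypothetical patient, corresponded to a recovery failure frequency of 80.0%.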

Discussion
Our multivariate logistic regression analysis revealed that being a man (vs. a woman), having a low FEV1.0/FVC, and a BAL target site other than RM/LL (vs. RM/LL) were independently associated with a higher frequency of BALF recovery failure. Using these independent predictive factors, we built BALF recovery failure prediction score models that are simple to use for risk determination. We successfully validated our prediction score models, based on the comparable discrimination in two separate cohorts. To our knowledge, this is the first and largest study identifying independent predictors of BALF recovery failure based on clinical, physiological, and radiological data in patients with DPLD, and the first study to propose simple-to-use recovery failure prediction score models. Retrospective studies on BAL recovery exist: Schildge et al. and Karimi et al. demonstrated that BALF recovery rates were weakly correlated with age, smoking history (pack-years), and FEV1.0/FVC, although these findings were based only on bivariate analyses 10,21. Reduced compliance of the lung parenchyma caused by aging or smoking may readily induce airway collapse during BAL 21. Our results are consistent with those findings. Furthermore, we identified the independent predictors of recovery failure based on multivariate logistic analyses in a large cohort of patients with DPLD, which is an advantage of this study.
We also found that being a man (vs. a woman) was an independent predictor of recovery failure, regardless of adjustments for smoking history, pulmonary function test results, BAL target site, and lung volume. Although the proportions of patients with CTD-ILD, those with IPF, and those with sarcoidosis were different between men and women in this study, our multivariate analyses adjusted for post-BAL diagnosis also demonstrated that being a man was an independent predictor. Therefore, it is unlikely that the difference in background disease composition between men and women affected our results. Li et al. demonstrated that women have a significantly smaller bronchial lumen diameter and cross-sectional lumen area than men, irrespective of smoking status 22. Gender differences in the anatomy of the lower airway (e.g., the difference in the cross-sectional area or volume of peripheral structures beyond a wedged bronchus) may affect BAL recovery.
A guideline on BAL recommended a BALF recovery rate of ≥ 30% both to obtain an optimal alveolar sample and for safety reasons, and recommended that BAL be discontinued if the recovery volume is too low 1. In patients with ILD who had a BALF recovery rate of ≥ 30%, Schildge et al. found no significant difference in the cell count between the higher and lower recovery rate groups 10. This suggests that in patients with ILD, the BALF recovery rate may not have a significant impact on diagnosis if the rate is ≥ 30%. On the other hand, except for studies on infectious diseases, there is insufficient evidence on whether BALF obtained with a recovery rate of < 30% reflects the true cell count of the distal airspaces or whether it can contribute to the diagnosis of DPLD 23. The BALF recovery rate cut-off value required for diagnosis likely varies depending on the disease. Because the present study aimed to identify the predictors of BALF recovery failure, we did not assess whether a low recovery rate or recovery failure affected cell count or diagnosis. Further studies are needed to clarify this issue.
In addition, the guideline suggested that the target site should be selected based on the HRCT rather than by defaulting to the RM/LL 1. However, the optimal target site varies among cases [24][25][26], and evidence on this has not been fully established. In cases that have HRCT abnormalities at various sites, including the RM/LL, attending physicians may be unsure whether to target a site other than the RM/LL that shows the most prominent abnormalities or the RM/LL, the traditional sites with high BAL recovery rates, when they show some degree of abnormality. In this context, by determining the risk of BAL recovery failure, our simple-to-use score model may serve as a guide when choosing the target site on which to perform BAL. For instance, if a man with DPLD has a low FEV1.0/FVC, a BAL target site other than RM/LL should probably be avoided to minimize the likelihood of recovery failure. On the other hand, when there are no abnormalities in the RM/LL, selection of other target sites with prominent HRCT abnormalities should be considered. However, if such patients are suspected to be at risk of complications or have contraindications, BAL that ends in recovery failure may cause harm without providing any benefit. Therefore, determining the risk of BAL recovery failure, in addition to the diagnostic yield of BAL and its impact on patient management, may help determine whether BAL or an alternative test is needed. Collectively, these models can provide helpful information for selecting a BAL target site or for considering BAL indications in patients at risk.
In this study, we set the FEV1.0/FVC cut-off value for recovery failure in our prediction model at 74% on the basis of the ROC analysis, although the standard spirometric criterion for airflow limitation is FEV1.0/FVC < 70% 27. We also evaluated the performance of our prediction model using a cut-off value of 70%, and the performance was comparable to that at 74%. A larger study is needed to determine the optimal FEV1.0/FVC cut-off value.
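The cut-off selection rule from the statistical analysis (the point maximizing sensitivity + specificity − 1) can be sketched as follows. This is an illustrative implementation under assumptions: the function name and the toy data are hypothetical, candidate cut-offs are restricted to observed values, and a patient is classed as test-positive when the value is below the cut-off, since a low FEV1.0/FVC predicts failure.

```python
from typing import Sequence, Tuple


def youden_cutoff(values: Sequence[float],
                  failures: Sequence[bool]) -> Tuple[float, float, float]:
    """Return (cutoff, sensitivity, specificity) maximizing sens + spec - 1.

    A value strictly below the cut-off is treated as predicting failure.
    """
    pos = [v for v, y in zip(values, failures) if y]
    neg = [v for v, y in zip(values, failures) if not y]
    if not pos or not neg:
        raise ValueError("both outcome classes are required")
    best = None
    for cutoff in sorted(set(values)):
        sens = sum(v < cutoff for v in pos) / len(pos)
        spec = sum(v >= cutoff for v in neg) / len(neg)
        j = sens + spec - 1.0
        if best is None or j > best[0]:  # keep the first maximum on ties
            best = (j, cutoff, sens, spec)
    return best[1], best[2], best[3]


if __name__ == "__main__":
    # Hypothetical FEV1.0/FVC values (%) and failure outcomes, for illustration only.
    ratios = [60.0, 65.0, 70.0, 72.0, 78.0, 80.0, 85.0, 90.0]
    failed = [True, True, False, True, False, False, False, False]
    print(youden_cutoff(ratios, failed))
```

For the values reported in the derivation cohort (sensitivity 80.0%, specificity 36.9% at the 74.4% cut-off), the same index works out to 0.800 + 0.369 − 1 = 0.169.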
We are aware of the limitations in our study. First, the retrospective design of the study renders it vulnerable to several biases. For instance, because our institution is a regional referral center, selection bias in our study population is a possibility. Second, this study included patients with a variety of DPLDs. Therefore, the physiological and/or morphological differences among diseases may have affected BALF recovery rates. Third, the BALF recovery rate may have been affected by factors other than those examined in the present study; these include suction pressure, individual anatomical differences, and the diameter of the bronchial segments. Finally, BAL guidelines recommend that the total volume of normal saline instilled should be between 100 and 300 mL divided into 3 to 5 aliquots 1,7 , which yields some variability in the real-world BAL protocols. In our study, we consistently used 150 mL (3 aliquots of 50 mL each) for the BAL protocol. Differences in instilled volume may be related to both the recovery rate and safety of BAL. The optimal volume to be instilled would need to be established, especially in patients at risk, including those with hypoxemia. A different study should analyze associations between the total instilled volume/aliquots, recovery failure, and safety.
In conclusion, our results revealed that being a man, having a low FEV1.0/FVC, and a BAL target site other than RM/LL were independent predictors of BALF recovery failure in patients with DPLD, and they suggest that simple-to-use score models based on these predictors are helpful for predicting recovery failure. Our results will provide valuable information for pulmonologists choosing a BAL target site and weighing the potential benefits against the burdens of BAL procedures. A prospective, multicentre study is required to validate these results.

Figure 3. (a) The recovery failure frequency in patients who showed an FEV1.0/FVC < 74% was significantly higher than that in those with FEV1.0/FVC ≥ 74% (27.8% vs. 13.5%, respectively; P < 0.01). (b) The recovery failure frequency in men was significantly higher than that in women (23.6% vs. 4.9%, respectively; P < 0.0001). (c) In model 1, each predictor (being a man, FEV1.0/FVC < 74%, a BAL target site other than the RM/LL) was assigned one point. The recovery failure frequencies in the model 1 prediction score groups (total scores 0, 1, 2, and 3) were 3.6%, 16.2%, 30.9%, and 80.0%, respectively (P < 0.0001; c-index 0.70). (d) In model 2, each predictor (being a man and FEV1.0/FVC < 74%) was assigned one point. The recovery failure frequencies in the model 2 prediction score groups (total scores 0, 1, and 2) were 4.2%, 18.2%, and 34.3%, respectively (P < 0.0001; c-index 0.69). FEV1.0, forced expiratory volume in one second; FVC, forced vital capacity.

Data availability
The data that support the findings of this study are available from the Hamamatsu University School of Medicine but restrictions apply to the availability of these data, which were used under license for the current study, and so are not publicly available. Data are however available from the authors upon reasonable request and with permission of the Hamamatsu University School of Medicine.

(b) The recovery failure frequency is significantly higher in men than in women (22.4% vs. 7.7%, P < 0.01). (c) In model 1, each predictor (i.e., being a man, FEV1.0/FVC < 74%, BAL target site other than the RM/LL) was assigned one point. The recovery failure frequencies in the prediction score groups (total scores 0, 1, 2, and 3) are 5.6%, 16.5%, 28.0%, and 100%, respectively (P < 0.0001; c-index 0.69). (d) In model 2, each predictor (i.e., being a man and FEV1.0/FVC < 74%) was assigned one point. The recovery failure frequencies in the prediction score groups (total scores 0, 1, and 2) are 6.6%, 17.4%, and 35.1%, respectively (P < 0.001; c-index 0.67). FEV1.0, forced expiratory volume in one second; FVC, forced vital capacity.

www.nature.com/scientificreports/
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.