Differential features of chronic cough according to etiology and the simple decision tree for predicting causes

Finding the etiology of chronic cough is an essential part of treatment. Although guidelines include many laboratory tests for diagnosis, these are not possible in many primary care centers. We aimed to identify the characteristics and the differences associated with its cause to develop a clinical prediction model. Adult subjects with chronic cough who completed both the Korean version of the Leicester Cough Questionnaire (K-LCQ) and the COugh Assessment Test (COAT) were enrolled. Clinical characteristics of each etiology were compared using features included in the questionnaires. Decision tree models were built to classify the causes. A total of 246 subjects were included for analysis. Subjects with asthma including cough variant asthma (CVA) suffered from more severe cough in the physical and psychological domains. Subjects with eosinophilic bronchitis (EB) presented less severe cough in the physical domain. Those with gastro-esophageal reflux disease (GERD) displayed less severe cough in all 3 domains. In logistic regression, voice hoarseness was an independent feature of upper airway cough syndrome (UACS), whereas female sex, tiredness, and hypersensitivity to irritants were predictors of asthma/CVA; less hoarseness was a significant feature of EB, and feeling fed-up and hoarseness were less common characteristics of GERD. The decision tree was built to classify the causes and the accuracy was relatively high for both K-LCQ and COAT, except for UACS. Voice hoarseness, degree of tiredness, hypersensitivity to irritants and feeling fed-up are important features in determining the etiologies. The decision tree may further assist in classifying the causes of chronic cough.

www.nature.com/scientificreports/

Cough is one of the most common symptoms of pulmonary and extra-pulmonary diseases, leading people to seek medical attention 1,2. Although the cough reflex is an important defense mechanism for airway clearance 3,4, many patients regard it merely as an annoyance that impairs quality of life, especially if it persists chronically 2. However, finding the etiology of chronic cough is not an easy process, as it can arise from various causes. Upper airway cough syndrome (UACS), asthma including cough variant asthma (CVA), eosinophilic bronchitis (EB), and gastroesophageal reflux disease (GERD) have been emphasized as the primary causes of chronic cough in non-smokers with normal chest radiographs 5-13. Additionally, a recent study highlighted that the same causes could also apply to smokers 14. Based upon these causes, there have been several efforts to make an effective diagnostic algorithm. Guidelines for chronic cough usually include chest radiography, spirometry with the bronchodilator reversibility test, the bronchoprovocation test, induced sputum analysis, and gastro-endoscopy 7-13,15. However, as these tests need several specialized laboratory devices and space, they may not be feasible in many primary care centers with limited resources. For these circumstances, some guidelines also recommend, as an alternative, empirical treatment for possible causes first and subsequent re-assessment according to the treatment response 16,17. Therefore, it is essential to develop a simple algorithm that identifies a plausible etiology based on clinical characteristics, allowing chronic cough to be treated in primary care clinics.
The aim of this study was to describe the phenotypic characteristics of chronic cough, compare their differences according to cause, and develop prediction models for etiology using two cough-related quality-of-life questionnaires: the Korean version of the Leicester Cough Questionnaire (K-LCQ) and the COugh Assessment Test (COAT).

[...] S3). There was no significant difference in age or cough duration before the visit among these causes, but female sex was predominant in asthma/CVA (76.9%). Subjects with asthma/CVA reported more severe cough in the total scores of COAT and K-LCQ, especially in the physical and psychological domains. In contrast, patients with EB had less severity in the physical domain of K-LCQ, and those with GERD showed less severity in COAT and in all three domains of K-LCQ. Patients with UACS and idiopathic cough showed no difference in COAT and K-LCQ scores (Table 1). Sex was not associated with scores of either COAT (p = 0.14) or K-LCQ (physical, p = 0.06; psychological, p = 0.40; social, p = 0.29; total, p = 0.19). However, as age increased, COAT scores decreased (regression coefficient = −0.07, p = 0.0002) and K-LCQ scores increased in both the psychological (regression coefficient = 0.02, p = 0.001) and social domains (regression coefficient = 0.02, p = 0.003).

Results
Characteristic features of chronic cough according to causes. Using the K-LCQ questionnaire, subjects with UACS presented with more voice hoarseness. Subjects with asthma/CVA complained of more bothersome phlegm, tiredness, hypersensitivity to irritants, and sleep disturbance in the physical domain; feeling out of control with cough and embarrassment in the psychological domain; and interference with job/daily tasks or life enjoyment in the social domain. Those with EB showed less chest/stomach pain, bothersome phlegm, and hoarseness in the physical domain; less worry about serious illness in the psychological domain; and less interference with job/daily activity or life enjoyment in the social domain. Subjects with GERD also presented with less phlegm, tiredness and hoarseness in the physical domain; less frustration, feeling fed up, and concern about others' thoughts in the psychological domain; and less interference with job/daily tasks or life enjoyment in the social domain. Patients with multiple causes or idiopathic cough showed no clinical difference from those with a single cause. Detailed scores of the respective items of K-LCQ and their comparisons are summarized in Table 2. For multivariable analysis, stepwise logistic regression was performed for each disease. Because the K-LCQ measures cough-specific quality of life, with higher scores indicating better quality of life and fewer symptoms, odds ratios below 1 indicate an association with more severe symptoms (lower QOL scores), and odds ratios above 1 indicate an association with milder symptoms (higher QOL scores). In UACS, more hoarseness of voice was selected (OR 0.76). Female sex (OR 2.16), more tiredness (OR 0.79), and hypersensitivity to irritants (OR 0.82) were significantly associated with asthma/CVA. For EB, less voice hoarseness (OR 1.59), and for GERD, less feeling of being fed-up (OR 1.35) as well as less voice hoarseness (OR 1.42) were selected. A multivariable model for idiopathic cough could not be built.
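The direction of these odds ratios can be made concrete with a small calculation. The following is a minimal Python sketch, not the authors' R analysis; it assumes that each reported odds ratio applies per 1-point increase in the corresponding K-LCQ item score (the text does not state the unit explicitly), and the function names are illustrative only.

```python
import math

# Odds ratios reported in the text for K-LCQ items.
# Assumption: each OR is per 1-point higher item score (higher = fewer symptoms).
REPORTED_OR = {
    ("UACS", "voice hoarseness"): 0.76,
    ("asthma/CVA", "tiredness"): 0.79,
    ("asthma/CVA", "hypersensitivity to irritants"): 0.82,
    ("EB", "voice hoarseness"): 1.59,
    ("GERD", "feeling fed-up"): 1.35,
    ("GERD", "voice hoarseness"): 1.42,
}


def odds_multiplier(odds_ratio, point_difference):
    """Multiplicative change in the odds for a given difference in item score."""
    return odds_ratio ** point_difference


def log_odds_coefficient(odds_ratio):
    """Logistic regression coefficient corresponding to an odds ratio."""
    return math.log(odds_ratio)


for (cause, item), value in REPORTED_OR.items():
    # OR < 1: the diagnosis becomes less likely as the item score rises,
    # i.e. it is associated with MORE severe symptoms on that item.
    direction = "more severe" if value < 1 else "less severe"
    two_points_worse = odds_multiplier(value, -2)
    print(f"{cause}: {item} -> {direction}; "
          f"odds x{two_points_worse:.2f} for a score 2 points lower")
```

Under this assumption, a patient whose hoarseness item is 2 points lower (worse) has the odds of UACS multiplied by 0.76 raised to the power −2, whereas the same difference lowers the odds of EB or GERD.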
Results of the multivariable analysis are summarized in Table 3. The AUCs of the ROC curves for classification of UACS, asthma/CVA, EB, and GERD were 0.60, 0.71, 0.70, and 0.77, respectively (Fig. 1). The logistic regression model was validated using LOOCV, and predictive validity was 0.76, 0.82, 0.87, and 0.89 for UACS, asthma/CVA, EB, and GERD, respectively. Detailed scores of each item of the COAT questionnaire are described in Supplemental Table S1. In addition, radar charts comparing the patterns of COAT for each cause are drawn in Fig. 2.

Decision tree predicting the causes of chronic cough. A decision tree was constructed as a non-parametric model. Initially, an integrated decision tree to diagnose all causes at once was built using items of K-LCQ in addition to age, sex, and current smoking status; K-LCQ items 2 (phlegm), 3 (tiredness), 5 (embarrassment), 9 (hypersensitivity to irritants), 10 (sleep disturbance), 15 (loss of energy), and 17 (concern for others), age, and sex were selected (Supplemental Fig. S4A). However, the accuracy of this decision tree was only 0.50. When we modeled another decision tree using the components of COAT, factors 2 (daily activity limitation), 3 (sleep disturbance), 4 (fatigue), age, sex, and current smoking status were selected (Supplemental Fig. S4B). Nevertheless, the accuracy of this decision tree was also low, at 0.49. Therefore, a specific decision tree for each cause of chronic cough was constructed using K-LCQ or COAT scores. For UACS, the decision tree using K-LCQ selected items 2 (phlegm) and 14 (hoarseness), and the accuracy of the tree using these 2 items was 0.60 (Supplemental Fig. S5A). Using COAT, factors 2 (daily activity limitation) and 4 (fatigue), age, and current smoking were chosen, and accuracy was 0.64 (Supplemental Fig. S6).
For asthma/CVA, the decision tree using K-LCQ selected items 1 (chest/stomach pain), 3 (tiredness), 5 (embarrassment), 15 (loss of energy), and 16 (worries about serious illness) and age, and the accuracy was 0.80 (Supplemental Fig. S5B). Using COAT, factors 3 (sleep disturbance) and 4 (fatigue), age, and sex were selected (Fig. 3A), and accuracy was 0.76. In EB, K-LCQ items 1 (chest/stomach pain), 2 (phlegm), 3 (tiredness), 7 (job/activity interference), 9 (hypersensitivity to irritants), 10 (sleep disturbance), 14 (voice hoarseness) and 19 (annoyance to partner/friend/family) were selected for the K-LCQ tree (Supplemental Fig. S5C), and accuracy was 0.88. For the COAT tree, factors 1 (cough frequency) and 2 (daily activity limitation) and age were selected (Fig. 3B), and accuracy was 0.83. In GERD, K-LCQ items 1 (chest/stomach pain), 5 (embarrassment), 9 (hypersensitivity to irritants), 13 (feeling fed-up), and 19 (annoyance to partner/friend/family) and current smoking status were selected for the K-LCQ tree (Supplemental Fig. S5D), and COAT factors 2 (daily activity limitation) and 3 (sleep disturbance) and age were selected for the COAT tree (Fig. 3C); their accuracy was 0.89 and 0.85, respectively.

Discussion
In this study, we characterized and compared the features of chronic cough according to underlying causes using commonly available cough questionnaires. Patients with asthma/CVA suffered from more severe cough, especially in the physical and psychological domains. Meanwhile, patients with GERD showed less severity in all 3 domains. Despite the different severities of cough in each cause, the time to visit a hospital was similar. Remarkably, voice hoarseness was an independent feature of UACS, presumably caused by post-nasal drip or inflammation of the nearby larynx. Female sex and hypersensitivity to irritants were predictors of asthma/CVA, consistent with previous descriptions 18,19. Less hoarseness was characteristic of EB, and feeling fed-up and hoarseness were less common features of GERD; less severe cough may be less likely to induce secondary laryngitis. Based on the differential manifestations according to etiology, we built practical decision trees to predict the causes. Moreover, the classification tree using COAT, a simplified questionnaire, showed accuracy similar to that of K-LCQ, which suggests that these COAT algorithms could be easily applied to everyday clinical practice.
Most previous studies on chronic cough have focused on prevalence, identifying common causes, or development of an effective diagnostic flow for laboratory tests 2-17. However, descriptions of the detailed features and comparisons of characteristics that distinguish each etiology, which are a fundamental step in medical practice, have been limited. Our study revealed the characteristic differences of chronic cough by etiology. One interesting finding is the difference between asthma/CVA and EB. Although they share the immunopathology of eosinophilic airway inflammation 20, the detailed features distinguishing them have not been characterized. Unlike those with asthma/CVA, patients with EB were not predominantly female, had less severe features, and differed in specific items of the cough questionnaire. Furthermore, the present study attempted to make a simple classification algorithm that uses clinical features to decide which test should be performed next. The decision tree enables physicians to classify causes quickly and can serve as a useful tool in clinical practice. Also, decision trees may provide clues to the pathogenesis of disease because of their structure. Initially, building a single decision tree for every cause was tried, but the accuracy of this classification was unsatisfactory. This was presumably due to patients with multiple concurrent causes; therefore, separate decision trees for each cause were constructed. These decision trees, specific to each cause, produced higher accuracy except for UACS, which suggests that the low accuracy of the initial single tree might be attributable to UACS. Since a simple physical examination could enhance the diagnostic accuracy for UACS, further large studies including this information are needed. Several limitations should be addressed.
Detailed results of physical examination, spirometry, bronchoprovocation test, eosinophil count in induced sputum, or exhaled fraction of nitric oxide were not reviewed in this analysis. Therefore, some misdiagnoses are possible, which could lead to underestimation of the accuracy of our model. Nevertheless, the clinical diagnosis was made by pulmonary specialists in respiratory centers, considering their available facilities. The COAT has been validated in only a single country 21 and would need to be further generalized. Although we used cross-validation in our analysis, we did not have an external validation set to test our model. Since prevalence of chronic cough could be different among different countries, predictive [...] particularly in primary care setting. Therefore, ascertainment of this cohort may limit generalizability to other races. Further large studies to confirm our findings are needed, especially from different countries. Lastly, information about comorbidities could have enhanced the understanding of their relation to symptoms and pathophysiology.
In conclusion, the degree of tiredness, hypersensitivity to irritants, feeling fed-up, voice hoarseness, and sex are important features in determining the etiologies of chronic cough, and the simplified COAT questionnaire can be used to distinguish causes as well as to measure cough severity. Further large studies to confirm our findings are necessary.

Methods and materials
Study subjects. Adult patients (≥ 18 years old) with chronic cough lasting more than 8 weeks were recruited from 16 respiratory centers in Korea from March 1, 2016 to February 28, 2018. All eligible candidates were enrolled during this period. The possible cause of chronic cough was assessed by pulmonary specialists in each hospital via the diagnostic flow of the Korean cough guideline, excluding those with suspected abnormalities on chest radiography 15. Enrolled participants completed both the Korean version of the Leicester Cough Questionnaire (K-LCQ) and the COugh Assessment Test (COAT) (Supplemental Fig. S1). The K-LCQ is a validated cough-specific quality of life (QOL) questionnaire containing 19 items divided into 3 domains: physical, psychological, and social 22,23. A 7-point Likert scale is used to evaluate the responses for each item, and the total score is calculated by summing the mean item scores of each domain, giving a range from 3 to 21; a higher score indicates better quality of life. The physical domain includes questions about chest/stomach pain, bothersome phlegm, tiredness, hypersensitivity to irritants, sleep difficulties, frequency of coughing bouts, presence of voice hoarseness, and loss of energy due to cough. The psychological domain includes questions about feeling fed-up, worrying about serious illness, and concerns about what other people might think. The social domain contains questions about interference with job or daily tasks, life enjoyment, interruption of telephone conversations, and annoyance of partner, family, or friends. The COAT is a simplified version of the K-LCQ used to assess the severity of cough; it is composed of 5 factors: frequency of cough, limitation on daily activities, sleep disturbance, fatigue, and hypersensitivity to irritants 21. All factors are scored on a single scale ranging from 0 to 4 (total scores from 0 to 20), where a higher score means more severe cough.
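The two scoring rules described above can be sketched as follows. This is an illustrative Python sketch, not code from the study. The item-to-domain mapping is an assumption based on the standard LCQ layout (the text names the domains but does not list every item number), and the function names are hypothetical.

```python
from statistics import mean

# Assumed item-to-domain mapping (standard LCQ layout; not fully stated in the text).
LCQ_DOMAINS = {
    "physical": (1, 2, 3, 9, 10, 11, 14, 15),
    "psychological": (4, 5, 6, 12, 13, 16, 17),
    "social": (7, 8, 18, 19),
}


def lcq_scores(responses):
    """Score a K-LCQ form.

    responses: dict mapping item number (1-19) to a Likert response (1-7).
    Returns the mean score per domain (1-7) and their sum as "total" (3-21).
    Higher scores indicate better quality of life.
    """
    if set(responses) != set(range(1, 20)):
        raise ValueError("expected responses for items 1 to 19")
    if any(not 1 <= value <= 7 for value in responses.values()):
        raise ValueError("each response must be between 1 and 7")
    scores = {
        domain: mean(responses[item] for item in items)
        for domain, items in LCQ_DOMAINS.items()
    }
    scores["total"] = sum(scores.values())
    return scores


def coat_total(factors):
    """Sum the five COAT factors (each 0-4); higher means more severe cough."""
    if len(factors) != 5 or any(not 0 <= value <= 4 for value in factors):
        raise ValueError("expected five factor scores between 0 and 4")
    return sum(factors)


# Hypothetical respondent answering 4 on every K-LCQ item and 2 on every COAT factor.
print(lcq_scores({item: 4 for item in range(1, 20)}))
print(coat_total([2, 2, 2, 2, 2]))
```

Because the K-LCQ total rises as symptoms improve while the COAT total rises as they worsen, the two totals are expected to move in opposite directions for the same patient.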
Consequently, K-LCQ and COAT scores are strongly negatively correlated 21. This study was conducted in accordance with the Declaration of Helsinki and was approved by the Institutional Review Board (IRB) of Ilsan Paik Hospital, Republic of Korea. A waiver of informed consent was also granted by the IRB.

Statistical analysis. All statistical analyses were performed using R version 3.6.0. Patients' characteristics are presented as mean (± standard deviation) or median (quartiles) for continuous variables and relative frequencies for categorical variables. Means were compared using the t-test or analysis of variance, and categorical variables were compared using the chi-squared test. A p-value < 0.05 was considered statistically significant. For multivariable analysis, logistic regression was performed to diagnose each cause using all items of the questionnaire in addition to age, sex and current smoking status; the best classification model was selected by stepwise selection using the stepAIC function in the MASS package. Stepwise regression is the step-by-step iterative construction of a regression model in which predictive variables are selected by an automatic procedure. In each step, potential explanatory variables are added to or removed from the previous model, and the resulting model is evaluated against a prespecified criterion, here the Akaike information criterion. To compare the diagnostic ability of each model, the area under the curve (AUC) of the receiver operating characteristic (ROC) curve was calculated using the ROCR package. To validate the predictive power of the logistic regression, leave-one-out cross-validation (LOOCV) was performed using the boot package 24. To build a decision tree for the prediction of each cause, the tree package was used. A tree is a nonparametric statistical procedure that classifies by applying a set of if-then-else logical conditions to assign unclassified observations to a predefined category.
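The AUC and leave-one-out steps described above can be illustrated in a few lines. The following Python sketch is a stand-in for the ROCR and boot calls used in the study, not a reproduction of them. The threshold classifier and the toy data are hypothetical, and reporting LOOCV as classification accuracy is an assumption, since the text does not state which cost function was used.

```python
def auc(scores, labels):
    """AUC as the probability that a random positive case scores higher than
    a random negative case (ties count as one half)."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes are required")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


def loocv_accuracy(rows, labels, fit, predict):
    """Leave-one-out cross-validation: refit on n-1 cases, test on the one left out."""
    hits = 0
    for i in range(len(rows)):
        model = fit(rows[:i] + rows[i + 1:], labels[:i] + labels[i + 1:])
        hits += predict(model, rows[i]) == labels[i]
    return hits / len(rows)


# Hypothetical one-variable classifier: predict the diagnosis when an item
# score falls below the midpoint between the two class means.
def fit_threshold(values, labels):
    pos = [v for v, y in zip(values, labels) if y == 1]
    neg = [v for v, y in zip(values, labels) if y == 0]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2


def predict_threshold(threshold, value):
    return 1 if value < threshold else 0


# Toy data (not from the study): lower item scores in diagnosed cases.
item_scores = [1, 2, 2, 3, 5, 6, 6, 7]
diagnosed = [1, 1, 1, 1, 0, 0, 0, 0]

# Negate the score so that higher values indicate the diagnosis.
print(auc([-s for s in item_scores], diagnosed))
print(loocv_accuracy(item_scores, diagnosed, fit_threshold, predict_threshold))
```

Refitting the model for every held-out case keeps that case out of its own training data, which is why LOOCV gives a less optimistic estimate than accuracy computed on the fitting sample.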
Algorithms for constructing trees work top-down, choosing at each step the variable that best splits the set of items 25. The tree partitions the data recursively in the direction that lowers impurity, as measured by the Gini index. The data were divided into training and test sets at a 7:3 ratio for validation; the decision tree was fitted on the training set using all items of the K-LCQ or COAT questionnaire in addition to age, sex, and current smoking status. The number of nodes retained after pruning was selected by K-fold cross-validation, and the accuracy of the tree model was assessed on the test set. Venn diagrams and radar charts were drawn using the vennDiagram function in the limma package and the radarchart function in the fmsb package, respectively.
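The Gini-based splitting step can be sketched for a single variable. This is a minimal Python illustration of the general technique, not the tree package's implementation; the helper names and the toy data are hypothetical, and a full tree would repeat the search over every candidate variable and recurse into each resulting partition.

```python
from collections import Counter


def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(values, labels):
    """Find the threshold on one variable that minimizes weighted Gini impurity.

    Candidate thresholds are midpoints between consecutive distinct values.
    Returns (threshold, weighted_impurity); threshold is None if no split
    lowers the impurity of the unsplit node.
    """
    n = len(values)
    best = (None, gini(labels))
    distinct = sorted(set(values))
    for low, high in zip(distinct, distinct[1:]):
        threshold = (low + high) / 2
        left = [y for v, y in zip(values, labels) if v < threshold]
        right = [y for v, y in zip(values, labels) if v >= threshold]
        weighted = (len(left) * gini(left) + len(right) * gini(right)) / n
        if weighted < best[1]:
            best = (threshold, weighted)
    return best


# Toy data (not from the study): one item score and a binary diagnosis.
item_scores = [1, 2, 6, 7]
diagnosed = [1, 1, 0, 0]
print(best_split(item_scores, diagnosed))
```

A node containing only one class has impurity 0, so the search favors thresholds that separate diagnosed from non-diagnosed cases as cleanly as possible.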