Pulmonary embolism (PE) is a life-threatening disease with significant morbidity and mortality1,2. The prognosis of PE relies on timely and accurate diagnosis, reasonable risk stratification, and well-monitored anticoagulation3. However, PE remains largely underdiagnosed because its clinical manifestations are vague and nonspecific4,5,6. Prompt recognition and diagnosis of PE have been shown to reduce mortality and morbidity7.

The clinical presentation of PE is nonspecific and covers a broad list of symptoms8. In fact, the classic clinical symptoms (e.g. the triad of hemoptysis, dyspnea, and pleuritic pain) are uncommon and not easily recognized in current clinical practice6. The overall presentation can easily be confused with systemic disorders or other cardiopulmonary diseases9. It is also important to realize that PE patients may be asymptomatic, which further contributes to underdiagnosis of the disease10.

In the diagnostic process, clinical symptoms should first be assessed, ideally with a validated prediction model11, although the final diagnosis should be based mainly on clinical findings, laboratory tests, and imaging data. Previous studies demonstrated that dyspnea, followed by pleuritic pain and cough, were the most common symptoms of PE12. Surprisingly, a large number of patients, including those with massive PE, had mild or nonspecific symptoms or were even asymptomatic13. For instance, one systematic review of 28 studies revealed that one-third of 5,233 patients with deep vein thrombosis (DVT) had asymptomatic PE14. Moreover, some patients had a delayed presentation, with related symptoms developing over days or weeks15. Therefore, a good understanding of the typical clinical symptoms associated with PE, their potential predictors, and their co-occurrence and combinations would improve the diagnosis and prognosis of PE. However, few studies have focused on this topic, and to date no study has clustered PE symptoms and patients. The few related studies were all based on Western populations6,16, and Chinese patients might have different characteristics due to racial differences17,18.

Based on a large sample of PE patients from a comprehensive hospital in China, this study aims to assess typical clinical symptoms, to identify clinical and laboratory indicators related to these symptoms, to group the patients into clusters based on their symptom presentations, and to identify the principal components of these symptoms. The results should improve our understanding of clinical symptoms and their potential combinations, which may help the clinical diagnosis of PE.


Ethics statement

This study was approved by the Medical Ethics Committee of Affiliated Dongyang Hospital of Wenzhou Medical University (Dongyang, China). Informed consent was obtained from all enrolled patients. Patient records and information were anonymized and de-identified prior to analysis. All experiments were performed in accordance with relevant guidelines and regulations.

Subjects and sample collection

Clinical data from a total of 551 patients hospitalized at the Affiliated Dongyang Hospital of Wenzhou Medical University between January 2012 and April 2016 were retrospectively reviewed. According to the criteria of the European Society of Cardiology guidelines, PE was confirmed by an identified filling defect in the pulmonary artery system on CT pulmonary angiography (CTPA) or by a positive venous ultrasound for extremity deep venous thrombosis (DVT) in patients with typical symptoms of PE (pleuritic pain or dyspnea). In addition, a positive D-dimer test or ventilation-perfusion (V/Q) scintigraphy supported a high probability of PE, as recommended in those guidelines. All clinical data, including symptoms, vital signs, comorbidities, and laboratory indicators, were collected on admission. The analysis included ten symptoms: dyspnea, tachypnea, hemoptysis, pleuritic pain, hydrothorax, syncope, hypoxia, fever, cough, and edema in lower extremities. Concurrent chest radiographs, CT scans, and echocardiographic examinations were performed and recorded.
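The confirmation rule described above can be written out as a short Boolean check. This is only a sketch of one reading of the criteria: it assumes that the typical-symptom requirement applies to the ultrasound route only, and the names `Findings` and `meets_pe_criteria` are hypothetical and not part of the study.

```python
from dataclasses import dataclass


@dataclass
class Findings:
    """Hypothetical record of the findings used in the confirmation rule."""
    ctpa_filling_defect: bool   # filling defect in the pulmonary arteries on CTPA
    dvt_on_ultrasound: bool     # positive venous ultrasound for extremity DVT
    pleuritic_pain: bool
    dyspnea: bool


def meets_pe_criteria(f: Findings) -> bool:
    """Return True if the findings satisfy the rule as read in this sketch.

    Assumption: a CTPA filling defect is sufficient on its own, while a
    positive ultrasound counts only together with a typical symptom.
    """
    typical_symptom = f.pleuritic_pain or f.dyspnea
    return f.ctpa_filling_defect or (f.dvt_on_ultrasound and typical_symptom)
```

Under this reading, a patient with DVT on ultrasound but neither pleuritic pain nor dyspnea would not be counted as confirmed unless CTPA showed a filling defect.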

CTPA examination

The thoracic CT scans were performed using a 64-detector multi-slice CT scanner (Brilliance 64-slice; PHILIPS, Amsterdam, Netherlands) with an intravenously injected contrast agent. Scans were acquired in multi-slice spiral mode with a collimation of 0.6, a rotation time of 0.5 s, a slice thickness of 5 mm, and a pitch of 1.0; the contrast agent (100 ml) was injected at 4 ml/s. CTPA results were categorized as positive for PE if an intraluminal filling defect was observed within a pulmonary arterial vessel and as negative if no filling defect was seen. Scans were considered technically inadequate only if the main or lobar pulmonary vessels were not visualized.

Statistical analysis

Characteristics were presented as mean ± standard deviation (SD) for continuous variables and as percentages for categorical variables. Multivariate logistic regression models were used to identify the associations between clinical indicators/predictors and PE symptoms. Hierarchical (system) clustering was used to identify typical clusters of PE patients based on their symptoms. In this method, Euclidean distance was used to measure how similar patients were in their symptoms, and Ward’s method was used to merge objects into clusters. Four cluster evaluation statistics (R-squared, semi-partial R-squared, Pseudo F, and Pseudo T-squared) were plotted against the number of clusters in the hierarchical analysis to determine the optimal number of clusters. Based on the plots, we considered solutions with 4–6 clusters. Within this range, we chose the number of clusters with a smaller semi-partial R-squared, larger R-squared, larger Pseudo F, and smaller Pseudo T-squared. The Chi-square test or Fisher’s exact test was used to determine whether the presentation of each symptom was significantly associated with cluster membership. To determine the extent to which the clusters based on presentations of PE corresponded to differences in other indicators, a series of one-way ANOVA tests was performed to compare these indicators across clusters. Principal component analysis (PCA) was used to identify potential principal components of PE symptoms. All principal components with an eigenvalue greater than 1 were retained. The significance level was set at 0.05 for all tests. Statistical analyses were run in SAS 9.4.
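As an illustration of the clustering step, the following minimal sketch applies Ward's method with squared Euclidean distances to binary symptom vectors and records R-squared and semi-partial R-squared after each merge. It is not the SAS procedure used in the study: the function names are hypothetical, the seven patient vectors (coded as dyspnea, tachypnea, syncope) are invented for the example, and the brute-force search is meant for small inputs only.

```python
from itertools import combinations


def centroid(points):
    """Mean vector of a list of equal-length numeric tuples."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def within_ss(points):
    """Sum of squared distances from each point to the cluster centroid."""
    c = centroid(points)
    return sum(sq_dist(p, c) for p in points)


def ward_history(data):
    """Agglomerate with Ward's criterion; return one record per merge."""
    clusters = [[row] for row in data]
    total_ss = within_ss(data)
    history = []
    while len(clusters) > 1:
        def cost(pair):
            # Increase in within-cluster sum of squares caused by merging:
            # n_a * n_b / (n_a + n_b) * ||centroid_a - centroid_b||^2
            a, b = clusters[pair[0]], clusters[pair[1]]
            return (len(a) * len(b) / (len(a) + len(b))
                    * sq_dist(centroid(a), centroid(b)))

        i, j = min(combinations(range(len(clusters)), 2), key=cost)
        merge_cost = cost((i, j))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
        pooled_within = sum(within_ss(c) for c in clusters)
        history.append({
            "n_clusters": len(clusters),
            "semipartial_r2": merge_cost / total_ss,
            "r2": 1 - pooled_within / total_ss,
        })
    return history


# Hypothetical patients coded as (dyspnea, tachypnea, syncope).
patients = [(1, 1, 0), (1, 1, 0), (1, 1, 0),
            (0, 0, 1), (0, 0, 1),
            (0, 0, 0), (0, 0, 0)]
history = ward_history(patients)
for step in history:
    print(step)
```

In this toy data, R-squared stays at 1 down to three clusters and drops once two distinct symptom profiles are forced together, which is the kind of jump the evaluation plots are used to detect.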


Characteristics of the study population

A total of 551 patients with PE were included in this study. Table 1 presented the characteristics of the study population. Patients presented a broad range of clinical symptoms. Dyspnea was the most common presenting symptom (64.1% of cases), followed by cough (60.4%), tachypnea (60.4%), and hypoxia (57.9%). The other symptoms were hydrothorax (26.9%), edema in lower extremities (23.6%), fever (22.1%), syncope (12.7%), hemoptysis (11.8%), and pleuritic pain (8.2%). The most common comorbidities were hypertension (45.2%) and chronic obstructive pulmonary disease (COPD) (32.1%).

Table 1 Demographic and clinical characteristics of study subjects (N = 551).

Associations between comorbidities and symptoms

The associations between comorbidities and each symptom were shown in Table 2. PE patients with atrial fibrillation (AF) were more likely to present tachypnea, hemoptysis, and hydrothorax than those without AF. Patients with coronary heart disease (CHD) had higher odds of presenting dyspnea and tachypnea, and those with COPD had higher odds of presenting dyspnea, tachypnea, hypoxia, and cough. Patients with lower extremity thrombosis were more likely to present edema in lower extremities. In contrast, PE patients with hypertension or diabetes, COPD, and lower extremity thrombosis had lower odds of presenting cough, syncope, and dyspnea, respectively.

Table 2 The associations between comorbidities and presentation of symptoms from logistic regression models (N = 551)
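To make the notion of "higher odds" concrete, the sketch below fits a logistic regression with a single binary predictor by Newton-Raphson and converts the coefficient to an odds ratio. It is a deliberate simplification of the multivariate models used in the study: there are no covariates, the counts are invented, and `fit_logistic_binary` is a hypothetical helper.

```python
import math


def fit_logistic_binary(counts, iterations=25):
    """Fit logit(P(symptom)) = b0 + b1 * exposed by Newton-Raphson.

    counts maps (exposed, symptom) -> number of patients, both coded 0/1.
    Returns the tuple (b0, b1).
    """
    b0 = b1 = 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x in (0, 1):
            n = counts[(x, 0)] + counts[(x, 1)]
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            resid = counts[(x, 1)] - n * p   # contribution to the score
            w = n * p * (1.0 - p)            # contribution to the information
            g0 += resid
            g1 += resid * x
            h00 += w
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        # Newton step: beta <- beta + H^{-1} g for the 2x2 information matrix H.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


# Hypothetical counts: 40 of 50 patients with a comorbidity show the symptom,
# compared with 30 of 60 patients without it.
counts = {(1, 1): 40, (1, 0): 10, (0, 1): 30, (0, 0): 30}
b0, b1 = fit_logistic_binary(counts)
odds_ratio = math.exp(b1)
print(round(odds_ratio, 3))
```

With one binary predictor the fitted odds ratio equals the cross-product ratio of the 2x2 table, here (40 × 30) / (10 × 30) = 4; in a multivariate model the same coefficient is interpreted after adjustment for the other predictors.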

Associations between laboratory indicators and symptoms

The associations between selected laboratory indicators and each symptom were summarized in Table 3. The counts of white blood cells (WBC), neutrophilic granulocytes, and red blood cells (RBC) and the level of D-dimer (D-D) were usually associated with increased odds of presenting these symptoms. For example, the WBC count was positively associated with the presentation of dyspnea, tachypnea, hydrothorax, syncope, hypoxia, and fever. The level of albumin (ALB) was usually associated with decreased odds of presenting symptoms, including dyspnea, tachypnea, hydrothorax, fever, and cough. Hemoglobin and platelet levels were occasionally associated with some symptoms. Arterial carbon dioxide tension (PPOCD) and tricuspid valve regurgitation (TTVR) were associated with increased odds of presenting several symptoms, while the pH value was associated with decreased odds of presenting dyspnea, tachypnea, hydrothorax, and syncope (Table 4).

Table 3 The associations between laboratory indicators and presentation of symptoms from logistic regression models (N = 551)
Table 4 The associations between blood gas/ultrasonic indicators and presentation of symptoms from logistic regression models (N = 551)

Five clusters of patients from cluster analysis

Based on the ten main symptoms presented in our PE patient sample, we generated five clusters of patients by cluster analysis. We evaluated the cluster solutions using the R-squared, semi-partial R-squared, Pseudo F, and Pseudo T-squared statistics, and the plots suggested a noticeable improvement at around five clusters. Thus, five was chosen as the optimal number, at which both the semi-partial R-squared and Pseudo T-squared were relatively small and both R-squared and Pseudo F were relatively large (Figure 1). The cluster history was presented in Table 1S in the supplementary file. Table 5 showed the number and proportion of patients with each symptom in each of the five clusters. Almost half of the patients (N = 250) were classified into Cluster 3, and about a quarter (N = 136) into Cluster 2. Among patients in Cluster 3, almost all presented dyspnea (98.8%) and tachypnea (92.8%), and most presented hypoxia (70.8%) and cough (71.6%). Patients in Cluster 2 had no specific or typical symptoms. Among patients in Cluster 1, all presented syncope, and almost half presented dyspnea, hypoxia, tachypnea, and cough, but few presented pleuritic pain, hemoptysis, or edema in lower extremities. Table 6 summarized age, gender, comorbidities, and clinical indicators related to PE for the five clusters. For example, patients in Cluster 3 were the oldest, were the most likely to have COPD, and had the lowest arterial oxygen tension (PPO) and the highest PPOCD and pulmonary artery pressure (PAP).
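For reference, the four evaluation statistics are conventionally defined as follows. The paper does not state the formulas, so the notation here is an assumption that follows the usual definitions for hierarchical clustering output: $T$ is the total sum of squares, $P_G$ the pooled within-cluster sum of squares for a solution with $G$ clusters, $n$ the number of patients, and $W_K$, $W_L$, $W_M$ the within-cluster sums of squares of clusters $K$ and $L$ (of sizes $N_K$ and $N_L$) that are merged into cluster $M$.

```latex
\begin{align}
R^2_G &= 1 - \frac{P_G}{T}, \\
\text{semi-partial } R^2 &= \frac{B_{KL}}{T},
  \qquad B_{KL} = W_M - W_K - W_L, \\
\text{pseudo } F &= \frac{(T - P_G)/(G - 1)}{P_G/(n - G)}, \\
\text{pseudo } t^2 &= \frac{B_{KL}}{(W_K + W_L)/(N_K + N_L - 2)}.
\end{align}
```

A large semi-partial R-squared or pseudo T-squared for a given merge indicates that two dissimilar clusters were joined, so the solution just before that merge is a candidate for the number of clusters; a large pseudo F indicates well-separated clusters at that level.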

Figure 1 Evaluation of cluster analysis results.

Table 5 Prevalence of symptoms by the clusters of patients with pulmonary embolism (N = 551).
Table 6 Age, gender, comorbidities, and clinical indicators by the clusters of patients with pulmonary embolism (N = 551)

PCA of PE symptoms

The principal components with an eigenvalue greater than 1 were retained so as to account for as large a proportion as possible of the total variability in the measured symptoms. The eigenvalues of the correlation matrix from PCA were shown in Table 1S, and the graph results of PCA were shown in Figure 2. The loadings of the ten symptoms on all principal components were presented in Table 7. In the first principal component, which accounted for 25% of the variance, the symptoms with large loadings were tachypnea, dyspnea, cough, hydrothorax, and hypoxia.
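The retention rule can be illustrated with a few lines of code. Because the PCA was run on a correlation matrix of ten symptoms, the eigenvalues sum to 10, so a component explaining 25% of the variance has an eigenvalue of about 2.5. In the sketch below only that first value and the number of retained components follow from the text; the remaining eigenvalues are invented, and `retain_components` is a hypothetical helper.

```python
def retain_components(eigenvalues, threshold=1.0):
    """Return (component number, eigenvalue, proportion of variance)
    for every component whose eigenvalue exceeds the threshold."""
    # For a correlation matrix the total equals the number of variables.
    total = sum(eigenvalues)
    return [(i + 1, ev, ev / total)
            for i, ev in enumerate(eigenvalues) if ev > threshold]


# Hypothetical eigenvalues for ten symptoms (they sum to 10).
eigenvalues = [2.5, 1.4, 1.2, 1.1, 0.9, 0.8, 0.7, 0.6, 0.5, 0.3]
kept = retain_components(eigenvalues)
for number, ev, proportion in kept:
    print(f"PC{number}: eigenvalue={ev:.1f}, variance explained={proportion:.0%}")
```

An eigenvalue above 1 means the component explains more variance than any single standardized symptom does on its own, which is the rationale for the cut-off.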

Figure 2 The graph results of PCA.

Table 7 Eigenvectors of ten principal components from PCA


In our study, the most common symptoms of PE were dyspnea, cough, and tachypnea, each present in more than 60% of patients. Some comorbid chronic conditions and some laboratory and clinical indicators were found to be related to these clinical symptoms among PE patients. The present study also suggested that PE is associated with a broad list of symptoms and that some PE patients share similar symptoms. Based on the ten symptoms recorded in our sample, we classified the patients into five clusters, which represent five groups of PE patients encountered in clinical practice. Four principal components of symptoms were identified; tachypnea, dyspnea, cough, hydrothorax, and hypoxia had the largest loadings in the first principal component, which accounted for 25% of the variability of the PE symptoms.

The diagnosis of PE remains challenging, particularly because commonly associated symptoms and signs are often absent. The prevalence of most well-known symptoms in our sample was similar to that described in prior studies2,5,6,18,19,20,21,22,23. However, the results of previous studies are inconsistent. For example, some studies showed that pleuritic pain and hemoptysis were the most frequent modes of presentation in PE patients2. A recent study indicated that most PE patients featured at least one of the following four symptoms: sudden-onset dyspnea, pleuritic pain, syncope, and hemoptysis11. In the present study, however, fewer PE patients had pleuritic pain and hemoptysis. Hemoptysis has traditionally been taught as a classic symptom of PE12,20,24,25,26,27. Previous studies have reported that the occurrence of hemoptysis in PE is as high as 20–25%12. However, in our study, hemoptysis was noted in only 11.8% of PE patients. We hypothesize that the lower incidence of hemoptysis might be related to the wide availability of CT scans, which allows early detection and timely anticoagulation in patients with PE and prevents further progression of the disease to pulmonary infarction28,29. These differences could also be explained by differences in age distribution and population27.

It is interesting that dyspnea in PE patients was often accompanied by other symptoms such as tachypnea, hypoxia, cough, and hydrothorax. For example, in Cluster 3 of the five clusters generated by cluster analysis, almost all patients presented dyspnea and tachypnea, most presented hypoxia and cough, and some presented hydrothorax. In the PCA results, these five symptoms had the largest loadings in the first principal component. A reasonable explanation is that PE is more likely to be detected incidentally when patients have obvious symptoms28. In addition, almost half of the PE patients in Cluster 3 had COPD. Patients with COPD had a significantly higher risk of dyspnea than those without COPD. The prevalence of dyspnea in PE patients with COPD was 91.3% in this study, which was similar to that in a recent study30. Indeed, the patients in Cluster 3 were usually difficult to differentiate from COPD patients. However, if the duration and severity of dyspnea and tachypnea differ from the patient's usual COPD condition, or if the symptoms or the hypoxia do not improve after treatment, the physician should strongly suspect PE.

In our study, we also found that syncope was the main presentation of PE patients in Cluster 1, accompanied by high oxygen partial pressure (PPO about 96%), and that 92.3% of these patients had no edema. Syncope was usually accompanied by hypoxia, which is not common in nervous system diseases. A possible explanation is sudden blockage of the pulmonary vasculature by an embolus. Such patients deserve close attention because of the sudden onset of unknown cause, the lack of typical presentations of PE, and the resulting risk of misdiagnosis. Therefore, we suggest that PE should be suspected early in patients with this type of presentation, and that diagnostic tests for PE, such as D-dimer testing and CTPA, should be performed.

PE patients in Cluster 2 had no typical symptoms. This cluster might explain the under-diagnosis of PE. Possible reasons for the absence of typical presentations are that the embolus did not dislodge and travel to the lungs to form a PE, or that the patients were taking anticoagulant medicine after the formation of lower limb thrombosis. PE patients with cancer and PE patients from the obstetrical department usually have no obvious symptoms when PE occurs and might belong to this cluster.

Among PE patients in Cluster 4, the most typical presentations were hemoptysis and cough, accompanied by smoking and COPD. Hemoptysis generally indicates massive PE, which now appears to be relatively uncommon because of advances in diagnostic techniques. In our study, only 11.8% of PE patients had hemoptysis. Researchers recently reported low rates of hemoptysis in PE (only about 5%)17,18. The difference might be due to differences in procedures, the level of diagnostic techniques, and the clinical skills of physicians between our hospital and hospitals in the Western world. Therefore, we believe that improving the detection and diagnosis of PE, together with early intervention and treatment, would help to reduce the occurrence of hemoptysis in patients with PE.

Among PE patients in Cluster 5, the most typical presentations were pleuritic pain and cough, accompanied by fever, dyspnea, and tachypnea. Clinicians should take care to differentiate such patients from those with acute myocardial infarction. The pleuritic pain of PE is difficult to describe, and if the pain cannot be explained by myocardial infarction or other related diseases, physicians should consider the possibility of PE.

All these findings suggest that more attention should be paid to the under-diagnosis of PE. Timely detection of PE is an essential prerequisite for prompt and effective treatment31. The current study indicated that clinical symptoms combined with risk factors might provide useful information for identifying highly susceptible PE patients, although the clinical manifestations of PE are often nonspecific. The clinical presentation also varies with the distribution and size of the emboli occluding the pulmonary vasculature, as well as with the age and pre-existing comorbidities of the patients32,33. This study identified signs, symptoms, and clinical risk factors associated with the presentation of PE, which may help physicians in the diagnosis of this dangerous and potentially fatal disease. Based on our PCA results, a scoring system could be established using the loadings of the ten symptoms in the first principal component. However, this component accounted for only approximately 25% of the variance, so a scoring system based on it would not be sensitive and specific enough to diagnose PE patients. Nevertheless, our study was a useful attempt to cluster and organize PE symptoms, and we believe that an accurate scoring system could be developed as more data accumulate. With the development of artificial intelligence and machine learning techniques, it is becoming possible to study these symptoms and their interactions and combinations in depth and to improve the diagnosis of PE. Moreover, in our future study, we aim to analyze all patients with suspected PE, identify the risk factors for PE incidence, and construct a risk scoring system for PE incidence rather than for PE patients only.
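The kind of loading-based score mentioned above might look like the following sketch. All loadings are invented for illustration (the actual values are those reported in Table 7), the helper `component_score` is hypothetical, and the score is a plain weighted sum of 0/1 symptom indicators, whereas a true principal component score would first standardize each symptom variable.

```python
# Hypothetical first-component loadings for the ten symptoms; the larger
# values mirror the symptoms reported as loading most heavily.
HYPOTHETICAL_LOADINGS = {
    "dyspnea": 0.45,
    "tachypnea": 0.45,
    "hypoxia": 0.40,
    "cough": 0.35,
    "hydrothorax": 0.30,
    "fever": 0.15,
    "hemoptysis": 0.10,
    "pleuritic pain": 0.05,
    "syncope": 0.05,
    "edema in lower extremities": 0.05,
}


def component_score(present_symptoms, loadings=HYPOTHETICAL_LOADINGS):
    """Sum the loadings of the symptoms a patient presents.

    Raises KeyError if a symptom is not one of the ten analysed symptoms.
    """
    return sum(loadings[symptom] for symptom in set(present_symptoms))


score = component_score(["dyspnea", "tachypnea", "hypoxia"])
print(round(score, 2))
```

As noted in the text, a score built on a component that explains only about a quarter of the variance would describe one symptom pattern rather than discriminate PE from other conditions, and any threshold would have to be validated against patients without PE.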

This study has some limitations. First, the major limitation is its retrospective design: data collection was based on the information available in the patient medical records. Second, we did not investigate the etiology of PE in this retrospective study. Third, the clinical findings could be under-represented because they depend on physician assessment and documentation, which may lead to recall bias. Further studies are therefore necessary to validate the diagnostic value of clinical characteristics.

In conclusion, different symptoms were associated with different clinical indicators among PE patients. PE patients could be grouped into different clusters of typical symptoms, which may improve the accuracy of diagnosis and prevent adverse events due to delayed diagnosis. The diagnosis of PE remains a challenging task, and our results improve the understanding of clinical symptoms and their potential combinations, which is helpful for the clinical diagnosis of PE.