Introduction

Clinical presentation and outcomes in patients with coronavirus disease-2019 (COVID-19) range from asymptomatic condition to death1. A study from a non-hospitalized cohort showed that the prevalence rate of post-acute sequelae of COVID-19 at 30-days post-infection was 68.7%2. Some individuals might experience symptoms lasting 12 weeks and longer3,4. These persistent symptoms are known collectively as post-acute sequelae of COVID-19, post-acute COVID-19, post-COVID-19 syndrome, post-COVID-19 condition, or long COVID5,6,7. The Office for National Statistics in the UK showed that 11.7% of COVID-19 study participants would describe themselves as experiencing post-COVID-19 symptoms by 12 weeks after infection (based on self-classification)8. The study participants with symptoms in the acute phase had a higher percentage of self-reported symptoms 12 weeks after infection (17.7% vs. 11.7%). Such a prolonged suite of signs and symptoms may interfere with daily life and the ability to undergo routine health care9,10. Several scholars have reviewed existing literature on post-COVID symptoms to produce a summary of the current knowledge of this syndrome and have suggested that multidisciplinary care, involving the long-term monitoring of symptoms to identify potential complications, physical rehabilitation, mental health, and social services support, should be provided5,11,12.

To track symptom changes in patients with COVID-19 effectively, some countries have applied specific instruments. Research participants in the Office for National Statistics in the UK survey were asked whether they had experienced any of the 12 symptoms (fever, headache, muscle ache, weakness/tiredness, nausea/vomiting, abdominal pain, diarrhea, sore throat, cough, shortness of breath, loss of taste, and loss of smell) in the past 7 days8,9. The research respondents from the Understanding America Study—COVID-19 Survey were asked to report whether they had experienced the following symptoms in the past 7 days: fever or chills, runny or stuffy nose, chest congestion, cough, sore throat, sneezing, muscle or body aches, headaches, fatigue or tiredness, shortness of breath, abdominal discomfort, vomiting, hair-loss, dry skin, body temperature higher than 100.4 F or 38.0 °C, diarrhea, lost sense of smell, and skin rash13. Only a few studies have shown the validity and reliability of these survey questionnaires. For example, Hughes et al.14 applied Rasch analysis to develop a symptom burden questionnaire for long-term COVID. The questionnaire included 17 scales, with 131 items for clinical assessment, which may make it difficult to implement. In a study by Bahmer et al.15 the post-COVID syndrome scale, with 12 clinical symptom complexes developed by k-means clustering and ordinal logistic regression analysis, was used to survey symptom severity in Germany. However, the tool only highlights the importance of physical sequelae in different organs and systems. Current evidence supports the presence of psychological symptoms, such as anxiety or depression after acute infection in people with COVID-194,16,17,18,19,20. Based on the above, further research is required to develop user-friendly, comprehensive, and validated tools to measure symptoms changes after acute COVID-19.

Several studies have described the occurrence of the physiological and psychological symptoms of COVID-192,11,21. A pooled prevalence data showed that the 10 most reported symptoms were fatigue, shortness of breath, muscle pain, joint pain, headache, cough, chest pain, altered smell, altered taste, and diarrhea11. Cardiopulmonary symptoms, including heart palpitations or tachycardia, are prevalent and are associated with significant disability and heightened anxiety21,22. Neurological ailments (e.g. vertigo, headache, sensory deficits, numbness, anosmia, ageusia, memory issues, deficits in concentration or cognition), depression, and sleep disturbance were also found at 4–12 months after infection11,15,23,24. Recently, it was reported that some people experienced persistent and prolonged salivary dysfunction and gastrointestinal symptoms (e.g. diarrhea, lack of appetite) after the acute illness of the infection had healed12,25. Almost all studies emphasize the occurrence, but not the severity, of symptoms. The development of a new post-COVID-19 instrument to measure the severity of physical and psychological symptoms could help clinical staff to observe changes in overall symptoms and serve as an indicator to guide treatment.

Against this background, we aimed to develop and validate a set of patient-reported instruments to monitor symptom severity after the acute phase of COVID-19. We defined post-COVID-19 symptoms as one or more physical and psychological symptoms that occur in individuals with a history of COVID-19 infection, typically persisting 3 months after the onset of COVID-19, which cannot be explained by other diagnoses.

Methods

Study design

A cross-sectional, descriptive, explorative study design was used to develop the new Post-COVID-19 Symptom Scale (PCSS), which was conceived to measure symptom severity in patients with a previous diagnosis of COVID-19. The scale was developed and validated using a two-stage process at a medical college in Taiwan, from February 2022 to September 2022. The study adhered to the tenets of the declaration of Helsinki.

Instrument and procedure

Stage I: scale development (March to March 2022)

In the first stage, during scale development, items were generated and reduced. Relevant symptoms were first established via a literature review, through which we identified post-COVID-19 symptoms. Initially, a pool of 24 potential items was generated. Scales with more reference points have proven to decrease the measurement error26. Hence, responses to each item were based on an 11-point Likert scale26. A higher scale score indicated more severe symptoms. In addition, the scale was refined using ratings from five experts: we recruited five healthcare providers with expertise in ENT (Ear, Nose and Throat), infection, chest medicine, Chinese medicine, or nursing and asked them to rate the original 24 items of the new PCSS in terms of three domains: relevance, importance, and appropriateness. This rating was based on a 4-point scale; the higher the score, the more relevant, important, or appropriate the item was considered. Finally, a 24-item PCSS was generated. They then listed the reasons for revising certain items and provided specific suggestions.

Stage II: scale validation (June to December 2022)

To validate the scale, we recruited patients from a medical university hospital in Taiwan, who were older than 18 years, diagnosed with COVID-19 at least 3 months, and who could communicate in Chinese. Individuals with previous cognitive disorders (e.g. dementia) were excluded because they could not complete or answer the questionnaire. Based on the report of Tinsley & Tinsley27 regarding sample sizes (a ratio of 5 to 10 participants per item), the minimum sample size was calculated as 120. After institutional review board approval, the investigator requested a list of patients with a diagnosis of COVID-19 between March and May 2022, and their contact numbers, from the hospital. Patients diagnosed with COVID-19 in March 2022 were interviewed by telephone in July 2022 based on consideration of 3 months after COVID-19 onset. By analogy, each patient in April and May. A study consent form and questionnaire were sent after verbal consent for participation was obtained from the subjects telephonically. Informed consent was obtained from all subjects. All methods were performed in accordance with the relevant guidelines and regulations. Thirty patients with a diagnosis of COVID-19 in March 2022 were assigned to a test–retest group and were additionally asked to complete the PCSS a second time within 2 weeks of the initial survey.

After collecting data, we used exploratory factor analysis (EFA) to identify the underlying components of the PCSS items. Construct validity was assessed using a confirmatory factor analysis (CFA). Moreover, in previous studies, females under the age of 50 years were significantly more likely to report residual symptoms3,28,29,30. Accordingly, we expected that a higher PCSS would be associated with female sex or younger age. Criterion-related validity was assessed by investigating its interference in daily life, based on the definition of COVID. A six-item measurement was used to assess life interference (general activity, mood, work, relationships with others, walking, and enjoyment of life). The six questions were rated on a 0–10 scale (0 = “no interference” to 10 = “strongly interference”) and formed the criteria used in our study. A higher score indicated stronger interference. We expected that a higher PCSS score would be associated with higher scores on criterion questions.

Data analysis

Descriptive statistics (mean, standard deviation, frequency, and percentage) were used to illustrate the sociodemographic characteristics of the sample (age, sex, education, job, marital status, and religion). The missing values are replaced with the median value of the entire feature column. For item analysis, the independent t-test was used to detect whether the difference between the highest (top-27) and lowest (lowest-27) groups differed statistically (p < 0.05). A critical ratio (CR) greater than 3.5 was applied to reduce the number of items and discriminate the adequacy of each item from the subject response31. Statistically significant items with item total correlations of less than 0.30 or more than 0.85 were also deleted to reduce the number of items31.

For EFA statistics, two indicators, the Kaiser–Meyer–Olkin measure of sampling adequacy (KMO) and Bartlett’s test of sphericity, were used to determine the adequacy of the data for the factor analysis. Significantly low p values (p < 0.05) on the Bartlett test of sphericity and KMO values between 0.8 and 1 indicated that sampling was adequate32. The main analysis method was principal component analysis with varimax or direct oblimin rotation. The final factor solution was based on the results and a scree plot, eigenvalues > 1, factor loadings > 0.4, and percentage of variance explained. For construct validity, the goodness of fit of the factor structure was assessed using CFA. CFA analyses were performed using IBM SPSS Amos 21.0 (IBM SPSS Inc., Armonk, NY, USA). CFA was performed using the robust maximum likelihood estimator method (MLR). Based on a multifaceted approach to the assessment of model fit33,34,35,36 and Hoyle’s recommendations37, chi-square (χ2), normed chi-square (CMIN/DF ≈ 2), the Tuker and Lewis Index (TLI; values ≥ 0.90), comparative fit index (CFI; values ≥ 0.90), the standardized root mean square residual (SRMR; values < 0.08), and the root mean square error of approximation (RMSEA; 0.05 ≤ values ≤ 0.08 indicate a good fit) are typically considered to indicate the goodness of the model fit. The reliability of the PCSS was evaluated using Cronbach’s alpha to assess the internal consistency of each factor and overall scale. A coefficient > 0.70. considered to indicate acceptable internal consistency, and coefficients greater than 0.80 were considered to indicate good internal consistency38.

Ethics approval and consent to participate

This study was approved by the ethics review committee of Taipei Medical University Hospital (N202206052).

Results

Sample characteristics

Overall, 412 patients during the study period met our inclusion criteria. Of these, 342 agreed to participate in the survey when contacted by telephone. The main reasons for refusal were a busy work schedule (n = 37) and privacy concerns (n = 22). For 310 of these participants, completed questionnaires were collected. The recovery rate was thus 90.64%. The participants ranged in age from 18 to 87 years (43.53 ± 13.943 years), and 82.99% were female. Table 1 shows the demographic characteristics of the participants.

Table 1 Characteristics of the study participants (N = 310).

Validity

The content validity index (CVI)39 of the PCSS across expert scores was 0.80, 0.85, and 0.80, respectively. None of the final PCSS items was scored as irrelevant, unimportant, or inappropriate by the five experts. Table 2 shows the mean scores and standard deviations of the individual items. All items were retained according to the item-level analyses and were further analyzed using exploratory factor analysis. Only five factors were extracted based on the screen plot results (KMO = 0.903; Bartlett’s test of sphericity: X2 = 5242.956, df = 276, p < 0.01). Both the direct oblimin and varimax rotation data showed a five-factor solution and a clear loading pattern. The results identified five factors explained 67.00% of the total variance, with eigenvalues > 1 (Table 3). No items were removed from the scale, and the remaining 24 items were retained for further analysis. Six of the final items assessing symptoms about cardiac, pulmonary, and central nervous system comprised one factor, called “life-threatening concern” (e.g. heart palpitations or difficulty in breathing). Four items were used to assess issues that affect thinking, comprising a second factor, called “cognitive concern” (e.g. brain fog or difficulty in concentration). Eight items assessed symptoms of psychological and physical problems. They comprised the third factor, called “psychological and non-life-threatening concern” (e.g. anxiety or dry mouth). Four items assessed perceptions of musculoskeletal pain or flu-like symptoms and comprised the fourth factor, called “ache concern” (e.g. muscle pain or sore throat). Finally, two items assessed the perceived potential olfactory and taste problems. They comprised the fifth factor, called “sensory concerns” (e.g. anosmia or ageusia).

Table 2 Item analysis of post-COVID-19 symptom scale (N = 310).
Table 3 Factor loading after varimax rotation.

Furthermore, a CFA was conducted to verify the two models. First, a five-factor CFA was performed, without considering the modification index (Table 4, Model 1). Then, Model 1 with modification indices was used when the value of the modification index exceeded 20 (Model 2). The model fit indices are summarized in Table 4. Of the two models, Model 2 had the best model fit (Model 1: RMSEA = 0.093, SRMR = 0.075, CFI = 0.873, TLI = 0.855; Model 2: RMSEA = 0.073, SRMR = 0.077, CFI = 0.922, TLI = 0.910; Table 4). Model 2 results suggested that the five-dimensional model was the best model for cross-validation via CFA (Fig. 1).

Table 4 Confirmatory factor analysis fit indexes (N = 310).
Figure 1
figure 1

Confirmatory factor analysis of post-COVID symptom scale.

Regarding known-group validity, our results showed that female patients had higher PCSS scores than male patients (49.750 ± 40.695 vs. 33.623 ± 33.485; t = -3.801; p < 0.01). Patients aged < 50 years had higher PCSS scores than those aged > 50 years (47.878 ± 40.677 vs. 32.322 ± 31.054; t = -3.613; p < 0.01). Pearson’s correlation coefficient between the total PCSS and the level of the total criterion questions was 0.566 (p < 0.01). Modest and moderate correlations between each PCSS factor and each criterion question are shown in Table 5 (r = 0.255–0.761, p < 0.05).

Table 5 Correlation coefficients between criterion questions and post-COVID-19 symptom scale (N = 310).

Reliability

The reliability assessment included evaluation of internal consistency. Cronbach’s α coefficient for the 24-item PCSS overall was 0.941. Cronbach’s α coefficients ranged from 0.813 to 0.924 among the five factors. The factor-total correlations ranged from 0.497 to 0.890 (p < 0.01). The test–retest reliability coefficient for the PCSS was 0.54 (p < 0.01).

Discussion

Accurate measurement of symptom-related changes is key in psychotherapy research and in the investigation of treatment efficacy40. We here described a newly developed scale that is a generic instrument designed to measure long-term COVID, to assess both physical and psychological symptoms. The strength of this study is that the initial items were developed using literature review and interviews with clinical professionals in Taiwan. Our tool measures each symptom on a scale of 0 to 11, allowing for the long-term tracking of symptom changes and for providing feedback on treatment. Compared with previous post-COVID-19 symptom assessment instruments14,15, we performed a validation study through exploratory and confirmatory factor analysis, known-group validity, and criterion validity based on our theoretical framework. Previous studies have shown that women under the age of 50 years were significantly more likely to report residual symptoms3,28,29,30. This is consistent with our results. These findings showed that all five factors are representative and, hence, the newly designed PCSS has good construct and criterion validity, indicating that it can be used to evaluate long-term COVID. The findings also showed that the new PCSS demonstrated good reliability based on internal consistency. Therefore, the results of our study indicated that the new PCSS is highly reliable and valid for assessing symptoms in patients with long-term COVID.

Previous studies have reported that the main symptoms of long COVID are categorized by physiological systems or body parts, such as the neurological system, digestive system, or thorax symptoms15,41. Our 24-item PCSS comprises five factors. Compared to the above studies, three of the factors identified in our study corresponded to the categories defined in the above studies and show a hierarchical relationship between them. In terms of urgent treatment, the priority order was life-threatening, ache, and sensory concerns. The subfactor called "cognitive concern" in our study included not only central nervous system symptoms, but also fatigue. This might be associated with the virus causing multiple neurological, respiratory, or immune responses during the early stages of COVID, and is consistent with previous studies42,43. Our study implied that psychological and non-life-threatening concerns are reflected in the perception of disease threat and the potential association between physical and mental conditions. For example, a previous study on Asian elderly noted that depression and anxiety were associated with a number of sleep-related problems44. Patients with severe hair-loss experienced psychological comorbidities, such as depression and anxiety45. More observational studies assessing the change in symptoms are needed to examine the association between the physical and psychological impact of COVID-19 infection further. Designing and arranging therapies tailored to different timing from the onset of COVID-19 is recommended.

Our study had some limitations. All the participants were enrolled from one hospital in Taipei. We did not survey the patients at other facilities or in other countries. We recruited only 30 patients, not all participants, to establish test–retest reliability. This sampling bias might undermine the external validity of the results and may cause a selection bias. Among psychological symptoms, only anxiety and depression were measured in our study. Adding psychological dimensions of stress19 and peritraumatic distress20 to future studies may offer a more comprehensive view. Post-COVID-19 symptoms may be a linguistically and culturally sensitive measure. Whether the identified post-COVID-19 symptoms in Taiwan are consistent with those of other countries merits further study. We did not capture qualitative data from participants. Collecting data on the comprehensiveness, relevance, and user-friendliness of PCSS might provide additional insights for further refinement. In future studies, inclusion of control group without a history of COVID-19 may provide more information, such as sensitivity and specificity.

Conclusions

This study contributed to the body of evidence on the psychometric properties of long COVID. This validated study showed that the PCSS is an appropriate tool for measuring and assessing post-COVID-19 symptoms. Valid and reliable questionnaires can accurately measure the severity of the 24 items in long COVID. Misjudgement of changes in symptoms can delay the ideal opportunity for treatment. Our results suggested that this PCSS scale should be integrated into an early assessment tool to assess the severity of each symptom effectively among patients with acute COVID. The PCSS scale is useful for developing specific and effective strategies regarding care dilemmas in the clinical environment. A complete understanding of the severity of symptoms will enlighten clinical professionals, particularly, outpatient healthcare providers. More precise and specific treatment strategies are needed to overcome serious symptoms in patients with long-term COVID in future.