Introduction

Post-acute sequelae of SARS-CoV-2 (PASC), also known as long coronavirus disease 2019 (long COVID-19), represents a growing concern in public healthcare. While consensus for the clinical definition of PASC is evolving, the National Institute for Health and Care Excellence (NICE) defines PASC as signs and symptoms that develop during or following an infection consistent with COVID-19, continue for more than four weeks and are not explained by an alternative diagnosis1,2. Several prospective cohorts of PASC have been described, each differing in case definition, size, and composition of the study population, symptoms evaluated, as well as follow-up frequency and duration. The prevalence of PASC is especially high amongst those COVID-19 patients who needed hospitalization: according to some reports3, up to half of hospitalized patients reported at least one physical, cognitive, or mental impairment, months after COVID-19 diagnosis. PASC is, thus, having a substantial impact on quality of life, healthcare costs, and economic productivity4.

The immunobiology of PASC is currently under intensive investigation with some leading hypotheses5 invoking the persistence of viral components driving immune stimulation, reactivation of viral infections such as EBV, dysbiosis of microbiome or virome, unrepaired tissue damage, and autoimmunity6,7.

Here, we describe patient-reported recovery data prospectively assessed from the acute infection through one year after hospital discharge for COVID-19. We identify predisposing factors and immune profiles from the acute phase of the disease that are associated with impaired clinical and functional recovery during the year following hospital discharge.

Results

Demographics and descriptive statistics

In all, 1164 participants were enrolled between May 5th, 2020 and March 19th, 2021 and followed up to 28 days while hospitalized. Of the 702 participants who survived hospitalization and were alive and on study at 3 months post discharge, 590 (84%) completed at least one quarterly set of surveys post discharge (survey respondent cohort) (Fig. 1A) with 29% (170) completing all 4 quarterly surveys and most completing 2 or more surveys (494; 84%)7. The participants who responded appear to be demographically representative of the entire cohort but were less likely to have prolonged hospitalization or discharge limitations compared with non-responders (42% versus 57%)8. Demographics, clinical characteristics, baseline radiographic and laboratory findings, as well as main outcomes during the COVID-19 hospital stay are provided for the survey respondent cohort in Table 1. The median age was 57 years (IQR 19), and 360 (61%) were men. 131 (22%) were Black/African American, and 189 (32%) were Hispanic/Latinx. 94% of participants had at least one comorbidity, most commonly hypertension (327; 55%) and diabetes (200; 34%). The median body mass index (BMI) was 31.8 kg/m2 (IQR 27.4–37.0). Two hundred and forty-five (52%) had an elevated baseline C-reactive protein (CRP) (≥10 mg/L) and 297 (50%) had an abnormal baseline D-dimer (>0.5 mg/L) upon hospital admission. Only 137 (25%) had no infiltrate on chest imaging upon hospital admission. One hundred and forty-one (24%) did not receive any oxygen therapy and 159 (27%) received ICU level care while inpatient. Four hundred (68%) received steroids and 377 (64%) received remdesivir. The median length of stay was 6 days (IQR 4–10) and most study participants had at least one complication (490; 83%) while inpatient. None of the participants had received a COVID-19 vaccine prior to admission; after discharge, 62% reported receiving the primary vaccine series and 36% a single booster dose. 
Three hundred and five (52%) out of 590 reported at least one symptom during the quarterly surveys, most commonly dyspnea (29%), followed by muscle aches/myalgia (21%), cough (20%), headache (19%), and fatigue/malaise (18%) (Table 1) (Figs. 1S and 2S). Thirty-two percent had symptoms affecting more than one organ system.

Fig. 1: Survey completion and clustering of participant-reported outcomes after hospital discharge (N = 590).

A Upset plot depicting the number of participants completing surveys at 3 (m3), 6 (m6), 9 (m9), and 12 months (m12) after hospital discharge. B Radar plot showing relative deficit for each of four different clusters across several participant-reported outcomes: EQ-5D-5L Health Recovery Score (Health), PROMIS Cognitive Function Score (Cognitive), PROMIS Psychosocial Illness Impact Positive Score (Psychosocial), PROMIS Global Mental Health Score (Mental), PROMIS Dyspnea Score (Dyspnea), and PROMIS Physical Function Score (Physical). The radial axis denotes a t-statistic comparing the within-cluster mean to the remaining sample, with t = 0 denoting the overall sample mean and negative values denoting a deficit. The 4 clusters are: solid gray, minimal deficit (MIN); blue line, physical predominant deficit (PHY); yellow line, mental/cognitive predominant deficit (COG); and red line, multidomain deficit (MLT). PROMIS Patient-Reported Outcomes Measurement Information System.

Table 1 Baseline characteristics of the survey respondent cohort

Latent class and cluster analysis

Fitting latent class mixed models (LCMMs) derived from the longitudinal patterns to each of eight participant-reported outcomes (PROs), we selected quadratic models with three groupings for EQ-5D-5L, Health Recovery Score, and PROMIS Dyspnea Score, and a linear model with three groupings for PROMIS Cognitive Function Score. There were no distinct groupings for the PROMIS Physical Function Score, PROMIS Global Mental Health Score, and PROMIS Psychosocial Illness Impact Positive Score. Median (IQR) values by visit for each PRO grouping are shown in Fig. 3SA–G.

We identified the Ward algorithm with six clusters as the optimal model. Fitting statistics across the five algorithms are shown in Fig. 3SH. Comparison of t-statistics across the PROs showed no associated deficit for certain clusters, which we collapsed into a single cluster labeled minimal deficit (cluster MIN, 358 participants, 60.7%, Table 1A–S). Based on associations with specific PROs, clinical phenotypes were defined by labeling the remaining three clusters as physical predominant (cluster PHY, 92 participants, 15.6%), mental/cognitive predominant (cluster COG, 82 participants, 13.9%), and multi/pan domain deficit (cluster MLT, 58 participants, 9.8%). Table 1B–S and Fig. 1B show, as a table and a radar plot respectively, the t-statistics for each PRO across these four clusters, which were named based on the predominant deficit for a specific PRO.
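As an illustration of the cluster-versus-rest comparison used to label the clusters, the following minimal sketch computes a t-statistic comparing the mean of one cluster with the mean of the remaining sample. It assumes an unequal-variance (Welch) form, which the text does not specify; the function name, scores, and labels are hypothetical and are not study data.

```python
from math import sqrt
from statistics import mean, variance


def cluster_vs_rest_t(scores, labels, cluster):
    """Welch t-statistic: mean of `cluster` minus mean of all other participants.

    Assumption: unequal-variance form; negative values indicate a deficit
    when higher scores mean better function.
    """
    inside = [s for s, lab in zip(scores, labels) if lab == cluster]
    outside = [s for s, lab in zip(scores, labels) if lab != cluster]
    se = sqrt(variance(inside) / len(inside) + variance(outside) / len(outside))
    return (mean(inside) - mean(outside)) / se


# Hypothetical PRO scores (higher = better function) and cluster labels.
scores = [52, 55, 50, 53, 38, 40, 36, 41]
labels = ["MIN", "MIN", "MIN", "MIN", "MLT", "MLT", "MLT", "MLT"]

t_mlt = cluster_vs_rest_t(scores, labels, "MLT")  # negative: deficit
t_min = cluster_vs_rest_t(scores, labels, "MIN")  # positive: no deficit
```

Repeating this calculation for each PRO and each cluster yields the kind of t-statistic profile that a radar plot can display, with values near zero indicating no deficit relative to the rest of the sample.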

Selected demographic characteristics, key comorbidities, and laboratory findings were significantly associated with the four PRO clusters by bivariate analysis (Table 1). Acute phase disease severity, whether defined by respiratory score at admission, SOFA score at admission, ICU utilization, mechanical ventilation, or inpatient overall clinical trajectory, was not associated with PRO cluster assignment (Table 1). Any use of remdesivir and steroids in the inpatient period was not associated with a decrease in PASC prevalence (Table 1). Adjusted multinomial logistic regression analyses, with the minimal deficit cluster as the reference, indicated that participants in the PHY cluster were more likely to have reported comorbidities of chronic pulmonary disease (OR 2.46; 95% CI 1.41–4.29) or chronic neurologic disorder (OR 2.13; 95% CI 1.20–3.78) and less likely to be male (OR 0.55; 95% CI 0.35–0.87) or of non-white race (OR 0.66; 95% CI 0.47–0.93). Relative to participants in the MIN cluster, participants with mental/cognitive predominant deficit were less likely to be 65 years or older (OR 0.41; 95% CI 0.18–0.94), less likely to be male (OR 0.54; 95% CI 0.36–0.82), more likely to have chronic cardiac disease (OR 1.72; 95% CI 1.02–2.88), and had longer acute hospitalization (OR per week 1.28; 95% CI 1.12–1.46). Participants with MLT deficit were more likely to have chronic pulmonary disease (OR 1.78; 95% CI 1.01–3.13) or chronic neurologic disorder (OR 4.37; 95% CI 2.14–8.94), received less oxygen supplementation (OR 0.54; 95% CI 0.34–0.87), and had longer acute hospitalization (OR per week 1.44; 95% CI 1.19–1.75) relative to those in the MIN cluster (Fig. 2).
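For readers less familiar with how odds ratios relate to regression output, the sketch below shows the usual conversion between a logistic regression coefficient with its standard error and an odds ratio with a Wald-type 95% confidence interval. It assumes the reported intervals are Wald intervals, which the text does not state; the function names are hypothetical, and the only numbers used are the OR and CI reported above for chronic pulmonary disease in the PHY cluster.

```python
from math import exp, log

Z95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def odds_ratio_ci(beta, se):
    """Odds ratio and Wald 95% CI from a log-odds coefficient and its SE."""
    return exp(beta), exp(beta - Z95 * se), exp(beta + Z95 * se)


def beta_se_from_or(odds_ratio, lower, upper):
    """Recover the approximate coefficient and SE from a reported OR and CI.

    Assumption: the interval is symmetric on the log-odds scale.
    """
    return log(odds_ratio), (log(upper) - log(lower)) / (2 * Z95)


# Reported above: chronic pulmonary disease, PHY vs MIN (OR 2.46; 95% CI 1.41-4.29).
beta, se = beta_se_from_or(2.46, 1.41, 4.29)
odds_ratio, lower, upper = odds_ratio_ci(beta, se)
```

The round trip reproduces the reported interval to two decimal places, which is consistent with, but does not prove, a Wald-type interval.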

Fig. 2: Forest Plot showing adjusted odds ratios (ORs) for factors associated with patient-reported outcome (PRO) clusters with more deficits compared to minimal deficit, using multivariable multinomial logistic regression (N = 590).

A Comparison of PRO clusters PHY, COG, and MLT with PRO cluster MIN. B Comparison of PRO clusters PHY and MIN (left); clusters COG and MIN (middle); clusters MLT and MIN (right). MIN minimal deficit, PHY physical predominant deficit, COG mental/cognitive predominant deficit, MLT multidomain deficit, MV Mechanical ventilation, ECMO Extracorporeal membrane oxygenation.

Laboratory assays by PRO cluster

N1 gene SARS-CoV-2 PCR cycle threshold (Ct) values up to 28 days since admission differed significantly among the four PRO clusters (Fig. 3 and Fig. 4S). Specifically, the viral RNA levels from the respiratory tract were significantly higher (lower Ct values) throughout those 28 days in participants within both the PHY and MLT clusters compared to the MIN cluster in the longitudinal GAM model (Fig. 4S, adj. p = 0.015). There was no difference in Ct values between the MIN and COG clusters. Viral levels declined over time in all four PRO clusters and the rate of viral clearance did not differ significantly among the PRO clusters (Fig. 4S). Similar trends were also observed with N2 SARS-CoV-2 PCR Ct values (Fig. 4S, adj. p = 0.013).
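Because Ct values are inversely related to viral RNA levels, a short sketch may help in reading these comparisons. It converts a difference in Ct into an approximate fold difference in starting template, assuming idealized amplification in which template doubles each cycle; real assay efficiency varies, and the function name and Ct values are hypothetical rather than study data.

```python
def fold_difference(ct_a, ct_b, efficiency=1.0):
    """Approximate fold difference in starting template of sample A relative to B.

    Assumption: efficiency=1.0 means perfect doubling per cycle, so each
    one-cycle decrease in Ct corresponds to roughly twice as much template.
    """
    return (1.0 + efficiency) ** (ct_b - ct_a)


# Hypothetical Ct values: sample A crosses the threshold 3 cycles earlier than B.
fold = fold_difference(ct_a=24.0, ct_b=27.0)
inverse = fold_difference(ct_a=27.0, ct_b=24.0)
```

Under this assumption, a sample with a Ct three cycles lower carries about eight times more target RNA.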

Acute phase anti-SARS-CoV-2 RBD IgG and Spike IgG titers were also significantly associated with PRO clusters (Fig. 3 and Fig. 4S). Both anti-RBD IgG and anti-Spike IgG levels were significantly lower in the MLT compared to the MIN and COG clusters by the GAM model (Fig. 4S, adj. p = 0.023) and in the PHY compared to the MIN and COG clusters by the GLM model (Figs. 3 and 4S, adj. p = 0.014). There was a swift rise in antibody (Ab) levels during the first 7 days followed by a modest increase between 7 and 20 days, reaching a plateau at day 28 in all PRO clusters. We also observed a significantly faster rise in Ab levels between 7 and 20 days for clusters PHY and MLT compared to the other 2 clusters (Fig. 4S; shape, anti-RBD IgG adj. p = 0.005 and anti-Spike IgG adj. p = 0.0017). The ratio of anti-RBD IgG to N1 Ct followed similar patterns (Fig. 4S).

Fig. 3: SARS-CoV-2 viral RNA levels and antibody responses.

A N1 Ct values: shown are SARS-CoV-2 N1 gene PCR cycle threshold (Ct) values (viral loads) measured from samples collected during the first 28 days of hospital admission by four PRO clusters: minimal deficit (MIN, n = 657), physical predominant deficit (PHY, n = 174), mental/cognitive predominant deficit (COG, n = 172), and multidomain deficit (MLT, n = 112). Shown are median values (horizontal lines), interquartile ranges (boxes), and 1.5 IQR (whiskers), as well as all individual points. Because lower Ct values indicate higher viral loads, the y axis is reversed. The viral loads were significantly (adj. p = 0.03) associated with the PRO clusters. B anti-RBD IgG values: shown are anti-RBD IgG values measured from samples collected during the first 28 days of hospital admission by four PRO clusters: minimal deficit (MIN, n = 907), physical predominant deficit (PHY, n = 221), mental/cognitive predominant deficit (COG, n = 230), and multidomain deficit (MLT, n = 149). Shown are median values of area under the curve (AUC) (horizontal lines), interquartile ranges (boxes), and 1.5 IQR (whiskers), as well as all individual points. The titers were significantly (adj. p = 0.014) associated with the PRO clusters. C Ratio of anti-RBD IgG to N1 values: shown are scaled ratios of anti-RBD IgG to SARS-CoV-2 viral load (N1 gene) values from samples collected during the first 28 days of hospital admission by four PRO clusters: minimal deficit (MIN, n = 560), physical predominant deficit (PHY, n = 156), mental/cognitive predominant deficit (COG, n = 141), and multidomain deficit (MLT, n = 99). Shown are median values (horizontal lines), interquartile ranges (boxes), and 1.5 IQR (whiskers), as well as all individual points. The ratio of titers to viral loads was also significantly (adj. p = 0.05) associated with the PRO clusters.
The four PRO clusters are shown in gray (minimal deficit, MIN), blue (physical predominant deficit, PHY), yellow (mental/cognitive predominant deficit, COG), and red (multidomain deficit, MLT). The lines and asterisks on top of the figure denote pairwise statistical significance, *p < 0.05, **p < 0.01, ***p < 0.001. Statistical differences were determined from generalized linear mixed effects models adjusting for age, sex, participant, and enrollment site. P values were adjusted using the Benjamini–Hochberg method to account for multiple comparisons. See Methods for more details.
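The Benjamini–Hochberg adjustment named in this legend can be sketched as follows. This is a generic illustration of the step-up procedure rather than the study's analysis code; the function name and the raw p values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices, smallest p first
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotone adjusted values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p values from five comparisons.
raw = [0.001, 0.008, 0.039, 0.041, 0.20]
adj = benjamini_hochberg(raw)
```

In this example the third and fourth comparisons, both below 0.05 before adjustment, no longer fall below that threshold afterwards, which illustrates why adjusted and unadjusted p values can lead to different conclusions.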

We next considered whether circulating leukocyte subset frequencies during the acute phase correlated with PRO clusters and carried out deep immunophenotyping using CyTOF to quantify cell frequencies in whole blood. Our analysis of cell subsets revealed that in the first 28 days of hospitalization, differences in the frequency of circulating B lymphocytes were significantly associated with PRO clusters (Fig. 4, adj. p = 0.0191). Participants with MLT deficits (adj. p = 0.0005) and COG deficits (adj. p = 0.025) had a significantly lower frequency of circulating B cells compared to those with MIN deficits (Fig. 4). Notably, within B cells, participants with MLT deficits showed a lower frequency of naïve B cells compared to the MIN cluster, although the difference did not remain significant after multiple-comparison correction for all B cell subsets (Fig. 5S). Other immune cell subtypes were not significantly associated with PRO cluster. The ratios of SARS-CoV-2 PCR Ct values and antibody titers to B cell numbers are shown in Fig. 6S, indicating that the lower frequency of circulating B cells may be relevant to outcome in the context of distinct kinetics of Ab production and viral clearance.

Fig. 4: B cell to non-granulocyte frequency.

Shown are B cell to non-granulocyte frequency values from samples collected during the first 28 days of hospital admission by four PRO clusters: minimal deficit (MIN, n = 584), physical predominant deficit (PHY, n = 140), mental/cognitive predominant deficit (COG, n = 145), and multidomain deficit (MLT, n = 107). Shown are median values (horizontal lines), interquartile ranges (boxes), and 1.5 IQR (whiskers), as well as all individual points. The repeated-measurement model identified significant differences of B cell to non-granulocyte frequency in association with convalescent clusters (adj. p = 0.0191). The four clusters are shown in gray (minimal deficit, MIN), blue (physical predominant deficit, PHY), yellow (mental/cognitive predominant deficit, COG), and red (multidomain deficit, MLT). The lines and asterisks on top of the figure denote pairwise statistical significance, *p < 0.05, **p < 0.01, ***p < 0.001. Statistical differences were determined from generalized linear mixed effects models adjusting for age, sex, participant, and enrollment site. P values were adjusted using the Benjamini–Hochberg method to account for multiple comparisons. See Methods for more details.

Autoantibodies with blocking activity against type I interferons (alpha, beta, and omega) were observed at the onset of hospitalization in 4.3% of participants (24 out of 563) across the four PRO clusters (Table 2S). Anti-interferon autoantibodies were detected in 5.3% of males and 2.7% of females (P = 0.14) and were more common in individuals older than 65 years of age (8.9% older vs 2.8% younger, P = 0.039). We observed a proportionately larger fraction of individuals with positive autoantibodies against type I IFNs in the PHY and MLT PRO clusters (PHY cluster = 6.9%, MLT cluster = 7.3%) compared to the other 2 clusters (MIN cluster = 3.2%, COG cluster = 3.8%), although this difference was not statistically significant (p = 0.2). In a case-control comparison matched (1:3) on age and sex, individuals with IFN autoantibodies had significantly higher viral loads than individuals without IFN autoantibodies (Fig. 7S, N1 Ct P = 0.012 and N2 Ct P = 0.006). We did not see significant differences in anti-SARS-CoV-2 RBD IgG or Spike IgG titers between these matched groups.
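One way to assemble a 1:3 matched comparison of this kind is sketched below. It is a minimal illustration that assumes greedy matching without replacement, exact matching on sex, and nearest-age matching within a caliper; the matching procedure actually used is not described here, and the function name, the `max_age_gap` parameter, and all records are hypothetical.

```python
def match_controls(cases, controls, ratio=3, max_age_gap=5):
    """Greedy 1:ratio matching without replacement.

    Assumptions: sex is matched exactly and age to the nearest available
    control within `max_age_gap` years (a hypothetical caliper).
    """
    available = list(controls)
    matched = {}
    for case in cases:
        candidates = [
            c for c in available
            if c["sex"] == case["sex"]
            and abs(c["age"] - case["age"]) <= max_age_gap
        ]
        # Stable sort: closest ages first, ties keep their original order.
        candidates.sort(key=lambda c: abs(c["age"] - case["age"]))
        chosen = candidates[:ratio]
        for c in chosen:
            available.remove(c)
        matched[case["id"]] = [c["id"] for c in chosen]
    return matched


# Hypothetical records: one autoantibody-positive case and six potential controls.
cases = [{"id": "A1", "sex": "M", "age": 70}]
controls = [
    {"id": "C1", "sex": "M", "age": 68},
    {"id": "C2", "sex": "F", "age": 70},
    {"id": "C3", "sex": "M", "age": 71},
    {"id": "C4", "sex": "M", "age": 74},
    {"id": "C5", "sex": "M", "age": 55},
    {"id": "C6", "sex": "M", "age": 66},
]
matches = match_controls(cases, controls)
```

Greedy matching is simple but order dependent; optimal matching algorithms can yield better overall balance when many cases compete for the same controls.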

Using a panel of 92 inflammatory markers, we analyzed serum samples using a proximity extension assay (Olink) across the period of acute illness; one protein, fibroblast growth factor 21 (FGF21), was significantly elevated in the COG cluster (adj. p = 0.0025) as well as the MLT cluster (adj. p = 0.000033), relative to the MIN cluster. The highest mean FGF21 values were in the MLT cluster (Fig. 5, Fig. 8S).

Fig. 5: Circulating fibroblast growth factor 21 expression.

Circulating fibroblast growth factor 21 (FGF21) NPX (normalized protein expression): shown are FGF21 NPX values from samples collected during the first 28 days of hospital admission by four PRO clusters: minimal deficit (MIN, n = 716), physical predominant deficit (PHY, n = 189), mental/cognitive predominant deficit (COG, n = 210), and multidomain deficit (MLT, n = 139). Shown are median values (horizontal lines), interquartile ranges (boxes), and 1.5 IQR (whiskers), as well as all individual points. The generalized additive model (GAM) identified a significant difference in FGF21 expression level in association with convalescent cluster groups (adj. p = 0.0135). The four clusters are shown in gray (minimal deficit, MIN), blue (physical predominant deficit, PHY), yellow (mental/cognitive predominant deficit, COG), and red (multidomain deficit, MLT). Statistical differences were determined from generalized linear mixed effects models adjusting for age, sex, participant, and enrollment site. P values were adjusted using the Benjamini–Hochberg method to account for multiple comparisons. See Methods for more details.

Weighted gene correlation network analysis (WGCNA) of 658 serum metabolites identified 27 modules, which corresponded to each metabolite feature. We observed a significant difference in shape (referred to as the smoothing term in the gamm4 documentation) for methylhistidine metabolism (global metabolomics module 3) and acylcarnitine metabolism (global metabolomics module 18) among the PRO clusters (Fig. 9SA, B). Notably, significantly lower levels of metabolites related to methylhistidine metabolism were observed for participants in the PHY and the MLT clusters, compared to the MIN cluster (shape adj. p = 0.049). Further, significantly higher levels of metabolites related to acylcarnitine metabolism were observed for participants in the PHY cluster, compared to the MIN cluster (adj. p = 0.049).

Discussion

In this large prospective study that followed participants from the time of acute COVID-19 hospitalization, more than half of the participants hospitalized with COVID-19 had persistent symptoms lasting 3 or more months after discharge, consistent with other studies9. The clustering of patient-reported outcomes (PROs) by predominant deficit (physical predominant, mental/cognitive predominant, and multidomain deficits) supports PASC as a heterogeneous clinical entity with distinct sub-phenotypes associated with unique perturbations of the immune system in the acute phase of the illness10,11. This cluster-based analysis also revealed a specific participant phenotype associated with persistent mental and cognitive impairments that were distinct from phenotypes associated with broader physical dysfunction. These findings suggest tailored approaches will be needed in the management of PASC12,13,14.

Overall demographic and clinical risk factors for PASC in our cohort include female sex, comorbidities13,15,16 such as chronic heart, lung, or neurologic disease, as well as longer length of hospital stay17. The biological basis for why females may be more susceptible to PASC than males has yet to be defined, though several models have been proposed18,19. One potential mechanism is autoimmunity, though we did not find any difference between males and females in autoantibodies against type I interferons. Hormonal factors may also play a role in perpetuating the hyperinflammatory status of the acute disease phase even after initial recovery20. While certain comorbidities are associated with PASC (e.g., chronic pulmonary diseases in the physical predominant deficit cluster), it is unclear based on our data whether some of the PASC disease burden could be misattributed to COVID-19 or whether COVID-19 accentuates these pre-existing conditions. Consistent with prior reports in non-hospitalized patients, we found no association between PASC and acute COVID-19 disease severity21,22,23,24, but a longer length of stay was associated with all PRO deficit clusters when compared to those with minimal functional deficits, similar to findings from a population-based cohort study25. Interestingly, our results suggest that supportive interventions such as oxygen therapy may be associated with a lower likelihood of being in the multidomain deficit PRO cluster, supporting the notion that early acuity-based interventions may positively influence clinical outcomes26.

Our data demonstrate that higher SARS-CoV-2 viral burden and lower Ab titers during the acute phase are associated with both the physical predominant deficit as well as the multidomain deficit PRO clusters. Of note, these virologic and serologic findings do not distinguish participants with minimal deficits from those in the mental/cognitive predominant deficit PRO cluster, suggesting that different factors could lead to the development of this particular cluster. The described calculated ratio (IgG/Ct value) is unique not only in the acute phase to determine the trajectory of acute disease course but also associates with PRO clustering in the convalescent phase8, and thus may represent a practical approach for patient risk stratification for both early mortality and subsequent morbidity from PASC. Our data confirm findings from other studies suggesting that PASC is associated with initial high SARS-CoV-2 RNA levels27 and a suboptimal serological response28,29,30 consistent with a reduced number of circulating B cells mostly in the multidomain deficit cluster. The therapeutic and protective effects of immunoglobulins were the basis for the pre-Omicron use of convalescent plasma and monoclonal antibodies (mAbs) and are the basis for the continued use of vaccination in the treatment and prevention of COVID-19. Recent observations of PASC resolution after SARS-CoV-2 vaccination raise the possibility of depletion of persisting viral reservoirs31. Since none of our study participants (enrolled in 2020 and early 2021) were vaccinated prior to their illness and only 6% received convalescent plasma, it was not possible to determine the impact of early Ab therapy or vaccination on the likelihood of PASC. Regarding other COVID-19-directed interventions, use of remdesivir and steroids in the inpatient period was not associated with a decrease in PASC prevalence. 
However, a recent study reported a protective effect against PASC when another antiviral, nirmatrelvir/ritonavir, was used in the acute phase of COVID-19, consistent with our finding of high viral load being associated with the development of PASC32,33. PASC has been associated with the detection of different autoantibodies early in the disease course6,27,34,35,36. In our study, the presence of IFN-specific autoantibodies is associated with viral load and severity of acute COVID37. Autoantibodies neutralizing IFNα, IFNβ, and/or IFNω result in a persistent dampening of IFN responses, likely leading to insufficient viral clearance (as seen in our study in our case-control subanalysis) and tissue damage.

While numerous cytokines have been associated with PASC in other studies, in this study, we found that only fibroblast growth factor 21 (FGF21) was significantly associated with the PRO clusters38,39. FGF21 is a cytokine known to regulate systemic glucose and lipid metabolism that is secreted from muscle in response to stress40 or even infection, particularly mitochondrial myopathy41, supporting a potential catabolic role of FGF21 on human muscle health42. In our metabolomics data, we found a significant association between global metabolomics module 3 (methylhistidine metabolism) and the PRO clusters, consistent with the inverse relationship between 3-methylhistidine (3MH) and FGF21, thought to be mediated by insulin sensitivity43. This observation is also consistent with the relationship between 3-methylhistidine and muscle cells, in which 3MH is a potential biomarker for muscle atrophy and skeletal muscle toxicity. Whether the association we observed with FGF21 reflects underlying mitochondrial dysfunction as the pathobiological basis for PASC is unclear and deserves further investigation43. Elevation of plasma FGF21 has also been noted in patients with myalgic encephalomyelitis/chronic fatigue syndrome (ME/CFS), an entity with clinical features that overlap with PASC, including fatigue, post-exertional malaise, sleep disturbance, and brain fog44. Interestingly, acylcarnitines, which also differed significantly between the PHY and MIN clusters, are known to have an essential role in metabolism and breaking down fatty acids for energy production45,46. In addition, acylcarnitine substrates have been investigated as surrogate biomarkers for ME/CFS47,48 and impaired metabolic health49, suggesting their possible role in reduced physical function after COVID-19 hospitalization.

Overall, participants in the MLT cluster with the most severe functional deficits during the year following hospitalization for COVID-19 had a suboptimal serological response likely driven by low circulating B cells potentially leading to high viral replication in the acute phase of the disease. In a case-control analysis, the highest percentage of autoantibodies against interferon (IFN) was noted in the MLT cluster, and the presence of these autoantibodies correlated with a high viral load. In addition, we noted an elevation in FGF21 levels in the MLT cluster similar to the COG cluster indicating possible muscle injury/stress that could explain common PASC symptoms such as fatigue and malaise. Interestingly, the COG cluster was not associated with a high viral load nor suboptimal serological response, suggesting the possibility that early immunologic parameters associated with mental and cognitive deficits following COVID-19 may be distinct from those leading to other functional limitations.

This study has several strengths, including enrollment of a diverse population from a wide variety of geographically dispersed hospitals and detailed clinical and biological phenotyping, as well as several limitations. Due to the timing of our recruitment window, very few study participants were infected with variants of concern or variants of interest or vaccinated prior to hospital admission8; thus, symptom persistence in our cohort may not be representative of patients infected with more recent emerging SARS-CoV-2 variants or with breakthrough infections50, but does provide an important characterization of post-acute disease follow-up in a virus-naïve population.

Along the same lines, certain symptoms that are now frequently linked to PASC (e.g., ‘brain fog’, sleep disturbance, dysautonomia) were not recorded as the surveys were designed prospectively early in the pandemic (March 2020) when PASC had not yet been reported. However, our use of standardized PROs that targeted cognitive, mental, and psychosocial functions enabled the identification of a specific cluster with predominant mental and cognitive deficits. Although symptoms at hospital admission were captured, pre-COVID symptomatology was not recorded, limiting estimation of the proportion of persistent symptoms directly attributable to PASC versus part of a pre-existing co-morbid condition. In addition, we did not attempt to identify alternative causes of persistent or new symptoms. However, PRO measures were chosen to attempt to mitigate this limitation by including a comparison to pre-illness baseline or some other appropriate recall period when possible. Our incidence of new onset or worsening health-related quality measures is in line with other published studies with 31% impairment in one study51, and 15.4% with poor physical component, and 32.6% with poor mental component in another study52.

Other investigators observed that certain occupations and socioeconomic status are associated with PASC15,16; we were unable to assess such associations due to a lack of occupational and socioeconomic data in our study.

Exclusion of non-hospitalized patients also limits the generalizability of our findings to patients with COVID-19 not requiring hospitalization. Also, our study did not include control groups of individuals who (1) did not have COVID-19, (2) were hospitalized for a non-COVID-19 respiratory viral infection, and/or (3) were hospitalized for elective procedures with a length of stay similar to that for COVID-19.

T-cell dysfunction has been described in PASC53, but our study did not include this assessment. We also did not fully explore other autoantibodies beyond those against type I IFNs (e.g., Ro/SS-A, La/SS-B, U1-snRNP, Jo-1, and P154 to β2 adrenoceptor, muscarinic M2 receptor, angiotensin II AT1 receptor, and angiotensin 1–7 MAS receptor)6. Similarly, EBV reactivation has been reported as potentially associated with PASC and was not investigated here.

We did not have an independent validation cohort; however, enrollment at multiple sites may have decreased selection biases.

While we largely report immunologic results observed in the analysis of blood, we also evaluated the acute immune response in the upper airway in non-intubated participants and in the lower airway in participants receiving mechanical ventilation; these studies did not detect significant associations with convalescent phenotypes. Acknowledging the broad organotropism of SARS-CoV-2, future research should explore other compartments serving as potential viral reservoirs55.

In addition to preventing and treating acute infections, there is a dire need to better understand and develop treatments for individuals with PASC. Our study represents a large multi-site prospective cohort with extensive clinical data capture and 12 months of follow-up after discharge, as well as intensive immunophenotyping efforts that employed a variety of innovative assays with rigorous data management and a standardized analysis pipeline. Our findings suggest that a functional antiviral Ab immune response contributes to viral clearance and may decrease the occurrence of PASC presenting as significant physical and multidomain deficits. Our results also highlight the benefit of measuring immune responses during the acute phase for the early identification of patients at high risk for PASC, which may facilitate testing and monitoring of targeted PASC prevention and treatment.

Methods

Ethics

NIAID staff conferred with the Department of Health and Human Services Office for Human Research Protections (OHRP) regarding the potential applicability of the public health surveillance exception [45CFR46.102(l) (2)] to the IMPACC study protocol. OHRP concurred that the study satisfied criteria for the public health surveillance exception, and the IMPACC study team sent the study protocol and participant information sheet for review and assessment to institutional review boards (IRBs) at participating institutions. Twelve institutions elected to conduct the study as public health surveillance, while 3 sites with prior IRB-approved biobanking protocols elected to integrate and conduct IMPACC under their institutional protocols (University of Texas at Austin, IRB 2020-04-0117; University of California San Francisco, IRB 20-30497; Case Western Reserve University, IRB STUDY20200573) with informed consent requirements. Participants enrolled under the public health surveillance exception were provided information sheets describing the study, samples to be collected, and plans for data de-identification and use. Those who requested not to participate after reviewing the information sheet were not enrolled. Participants did not receive compensation for study participation while inpatient but were offered compensation during outpatient follow-ups.

Study design and setting

The study followed the Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) guidelines for reporting observational studies56. This study was registered at clinicaltrials.gov (NCT04378777).

Study participants

Patients 18 years and older admitted to 20 US hospitals (affiliated with 15 academic institutions) between May 2020 and March 2021 were enrolled within 72 h of hospital admission for COVID-19. Only participants with a confirmed positive SARS-CoV-2 PCR and symptoms attributable to COVID-19 were followed longitudinally57. Participants were provided compensation on an outpatient basis.

Data collection, study variables, and biologic samples

Specific data elements were acquired via a review of electronic medical records during the inpatient period8. The study was designed to enroll participants of both sexes, and sex at birth was collected based on self-report or caregiver report. Length of hospital stay, complications, mortality, and other protocol-defined outcomes were assessed over 28 days. In addition, self-reported symptoms, reinfections, SARS-CoV-2 vaccination, rehospitalizations, and standardized patient-reported outcome surveys were assessed quarterly for the duration of the study up to 12 months after initial hospital discharge.

Biologic samples collected consisted of blood and mid-turbinate nasal swabs (self- or staff-collected). Samples were collected at enrollment (Day 1) and on Days 4, 7, 14, 21, and 28 after hospital admission (and, if feasible, on Days 14 and 28 for discharged participants).

Patient-reported data were collected using a comprehensive digital remote monitoring tool in the form of a mobile application developed by My Own Med, Inc. Alongside the mobile application, an administrative portal was developed so that study personnel could enter information during site visits or telephone interviews with a study coordinator, ensuring real-time electronic data capture.

The surveys administered at these remote visits assessed the following symptoms58:

  • Upper respiratory symptoms: sore throat, conjunctivitis/red eyes

  • Cardiopulmonary symptoms: shortness of breath (dyspnea), cough

  • Systemic symptoms: fever, chills, fatigue/malaise, muscle aches (myalgia)

  • Neurologic symptoms: loss of smell/taste (anosmia/ageusia), headache

  • Gastrointestinal symptoms: nausea/vomiting

In addition, the functional assessments of general health and the evaluation of deficits in specific health domains were conducted using validated Patient-Reported Outcome (PRO) measures, including:

  • EQ-5D-5L, a standardized, self-administered instrument that describes and quantifies health-related quality of life59

  • Patient-Reported Outcomes Measurement Information System (PROMIS). The PROMIS measures administered included:

  1. PROMIS® Item Bank v2.0 - Physical Function and PROMIS Item Bank v2.0 - Cognitive Function, two computer adaptive surveys with tailored questionnaires based on item response theory60.

  2. PROMIS Scale v1.2 - Global Health Mental 2a and PROMIS Item Bank v1.0 - Psychosocial Illness Impact—Positive - Short Form 8a, two surveys with fixed questions61,62,63,64.

  3. PROMIS Pool v1.0 - Dyspnea Time Extension computer adaptive instrument for participants who reported shortness of breath65,66,67. This 7-item questionnaire assesses whether there has been a meaningful increase or decrease in the duration of time needed by an adult to perform a given task in the past 7 days compared to 3 months ago due to shortness of breath.

For all PROMIS measures, scoring was based on PROMIS standardized instructions and conversion to a T-score68.

  • Health Recovery Score: Overall health was also assessed by a health recovery score utilizing a Visual Analog Scale of 1–100 to indicate overall physical and mental function compared to pre-COVID function.
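PROMIS measures such as those listed above are reported on a T-score metric, a linear rescaling of the latent trait estimate so that the reference population has a mean of 50 and a standard deviation of 10. The following is a minimal sketch of that rescaling only; the function name and the example theta values are illustrative assumptions, and the study itself followed the PROMIS standardized scoring instructions.

```python
def theta_to_t_score(theta: float) -> float:
    """Rescale a latent trait estimate (theta, in SD units of the
    reference population) to the T-score metric (mean 50, SD 10)."""
    return 50.0 + 10.0 * theta


# Hypothetical theta estimates, for illustration only.
for theta in (-1.5, 0.0, 0.8):
    print(f"theta = {theta:+.1f} -> T-score = {theta_to_t_score(theta):.1f}")
```

On this metric, a participant scoring 35 on Physical Function sits 1.5 standard deviations below the reference population mean.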

All data were reviewed centrally to ensure accuracy and consistency. Any data concerns were resolved by querying the site.

The full study data collection forms for the quarterly outpatient surveys are provided in the Supplementary Information (Surveys Administered).

Assays

  • SARS-CoV-2 viral load was assessed by a central laboratory from nasal swab samples at each defined time point by RT-PCR of the viral N1 and N2 genes69 (Supplementary Methods).

  • Anti-SARS-CoV-2 spike (S) and receptor binding domain (RBD) antibodies were quantitated by enzyme-linked immunosorbent assay (ELISA) in serum specimens70 (Supplementary Methods).

  • Autoantibodies: blocking antibodies against type I IFNs (IFNα, IFNβ, and IFNω) were assessed in a multiplex, particle-based assay (Supplementary Methods).

  • Blood CyTOF: 65 cell subsets were identified using a panel of 43 antibodies to cell surface markers expressed by distinct lineages and intracellular markers of functional status. A semi-automated gating strategy was used71 (Supplementary Methods).

  • Proximity Extension Assay (Olink): the multiplex inflammation panel (Olink Bioscience, Uppsala, Sweden) includes 92 proteins associated with human inflammatory conditions (Supplementary Methods).

  • Plasma global metabolomics was assessed using liquid chromatography-mass spectrometry, as described in the Supplementary Methods.

Statistics

Convalescent clinical outcome assessment

Overall analytic approach

Because we focused on how longitudinal patterns in PROs might define clinical phenotypes relevant to PASC, we modeled each PRO using an approach that assumes the population is composed of distinct groups, each of which follows a different underlying and unobserved trajectory, with individual-level variation around that trajectory. Our approach to the high-dimensional data problem presented by multiple PROs captured across multiple timepoints was first to preserve longitudinal patterns within each PRO using latent class mixed models (LCMMs). We then further reduced the dimensionality of the data by clustering the resulting groupings across multiple PROs using five clustering algorithms and four diagnostic statistics. This approach mirrors a recent study72 with similar aims to cluster high-dimensional data across several clinical outcomes. To interpret each cluster with respect to the contributions of each PRO, we compared the mean value of each PRO within that cluster to the mean in the remaining sample. Finally, we compared clinical, demographic, and laboratory assay variables across the clusters thus determined. Convalescent clinical outcomes were analyzed using R Statistical Software version 4.2.1.

Latent class analysis

We considered the PROs collected at quarterly intervals and modeled them longitudinally using LCMMs, a family of models of which a well-known and commonly used example is the group-based trajectory model73,74, as implemented in the R package “lcmm”. Outcome variables were the EQ-5D-5L, global Health Recovery Score, PROMIS Cognitive Function Score, PROMIS Physical Function Score, PROMIS Dyspnea Score, PROMIS Global Mental Health Score, and PROMIS Psychosocial Illness Impact Positive Score. We evaluated linear and quadratic models with the number of groupings ranging from 1 to 5 and specified the model based on convergence criteria and goodness of fit according to the Bayesian Information Criterion (BIC). For each outcome, we selected the model that converged and had the lowest BIC.
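The selection rule described above (keep only converged fits, then take the lowest BIC) can be sketched as follows. This is a plain Python illustration, not the R code used in the study: the `CandidateFit` container, the log-likelihoods, and the parameter counts are hypothetical, and the sample size in the BIC penalty is assumed here to be the number of participants.

```python
import math
from dataclasses import dataclass


@dataclass
class CandidateFit:
    shape: str        # "linear" or "quadratic" trajectory
    n_classes: int    # number of latent groupings (1 to 5)
    log_lik: float    # maximized log-likelihood
    n_params: int     # number of estimated parameters
    converged: bool   # whether the optimizer met its convergence criteria


def bic(fit: CandidateFit, n_subjects: int) -> float:
    """BIC = k * ln(n) - 2 * log-likelihood (lower is better)."""
    return fit.n_params * math.log(n_subjects) - 2.0 * fit.log_lik


def select_model(fits, n_subjects):
    """Return the converged candidate with the lowest BIC."""
    converged = [f for f in fits if f.converged]
    if not converged:
        raise ValueError("no candidate model converged")
    return min(converged, key=lambda f: bic(f, n_subjects))


# Hypothetical candidate fits for a single PRO (values are made up).
fits = [
    CandidateFit("linear", 1, -2100.0, 4, True),
    CandidateFit("linear", 2, -2050.0, 7, True),
    CandidateFit("quadratic", 3, -2040.0, 13, True),
    CandidateFit("quadratic", 4, -2000.0, 17, False),  # dropped: did not converge
]
best = select_model(fits, n_subjects=590)
print(best.shape, best.n_classes, round(bic(best, 590), 1))
```

In this made-up example, the four-class quadratic fit has the highest likelihood but is discarded because it did not converge, and the BIC penalty favors the two-class linear model over the three-class quadratic one.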

Cluster analysis

Using the assigned groups from the LCMM step for each PRO, we then applied cluster analysis to group participants with similar PRO longitudinal patterns. For those PROs with no distinct longitudinal clusters, we assigned to each participant the within-participant mean for that PRO. We calculated inter-participant similarity using Gower distance implemented by R package “CluMix”. We applied five clustering algorithms (Ward, McQuitty, Average, PAM, and Complete) to the distance matrix to identify the optimal number of clusters, and selected the optimal model based on four cluster fitting statistics (within-cluster SS, average silhouette width, Dunn index, and ratio of within-to-between SS). We then excluded cluster solutions with degenerate clusters (e.g., those with only one participant).

We further excluded solutions containing clusters of five or fewer participants (n ≤ 5). Thus, the best model performed well on the four fitting statistics overall and had a clinically interpretable number of clusters. We generated cluster assignments using the R package “cluster”, with fit statistics implemented by the package “fpc”. To estimate the strength of association of each PRO with particular clusters, we calculated a t-statistic comparing the mean value of each PRO within each cluster versus the mean value of that PRO across the remaining clusters. The t-statistics were recoded such that negative values indicated a greater degree of patient-reported deficit, while positive values indicated no reported deficit.
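Two computations in this step, the Gower distance between participants described by mixed categorical and numeric features and the cluster-versus-rest t-statistic, can be sketched as follows. This is a plain Python illustration rather than the R packages used in the study; the Welch (unequal-variance) form of the t-statistic, the `higher_is_worse` flag used for sign recoding, and all example values are assumptions made for the sketch.

```python
import math
from statistics import mean, variance


def gower_distance(rows, is_categorical):
    """Pairwise Gower distance for rows mixing categorical and numeric features."""
    n, p = len(rows), len(is_categorical)
    # Range of each numeric feature, used to scale differences to [0, 1].
    ranges = [
        None if is_categorical[k]
        else max(r[k] for r in rows) - min(r[k] for r in rows)
        for k in range(p)
    ]
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            total = 0.0
            for k in range(p):
                a, b = rows[i][k], rows[j][k]
                if is_categorical[k]:
                    total += 0.0 if a == b else 1.0   # simple mismatch
                elif ranges[k] > 0:
                    total += abs(a - b) / ranges[k]   # range-scaled difference
            dist[i][j] = dist[j][i] = total / p
    return dist


def cluster_vs_rest_t(cluster_values, rest_values, higher_is_worse=False):
    """Welch t-statistic for one cluster versus all remaining participants.

    The sign is flipped for PROs where a higher score means a worse state,
    so that negative values always indicate a greater deficit.
    """
    se = math.sqrt(
        variance(cluster_values) / len(cluster_values)
        + variance(rest_values) / len(rest_values)
    )
    t = (mean(cluster_values) - mean(rest_values)) / se
    return -t if higher_is_worse else t


# Hypothetical participants: (trajectory group for one PRO, mean of another PRO).
rows = [("class1", 40.0), ("class1", 50.0), ("class2", 60.0)]
dist = gower_distance(rows, is_categorical=[True, False])

# Hypothetical PRO scores in one cluster versus everyone else.
t_stat = cluster_vs_rest_t([30.0, 35.0, 40.0], [45.0, 50.0, 55.0, 60.0])
print(dist[0][1], dist[0][2], round(t_stat, 2))
```

A distance matrix of this kind is the input that hierarchical linkage methods and PAM operate on; the negative t-statistic in the example marks a cluster whose mean score falls below that of the remaining participants.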

Statistical analysis—demographic & clinical variables

We report the median (interquartile range, IQR) for continuous variables and frequency (percent) for categorical variables. We examined bivariate associations between demographic and clinical factors and the PRO clusters using the Wilcoxon rank-sum test for continuous variables and the chi-square test for categorical variables. Multinomial logistic regression was used to examine the adjusted associations between demographic and clinical factors and cluster membership, comparing the likelihood of being in each of the deficit clusters relative to the MIN deficit cluster. P < 0.05 was considered statistically significant.

Analysis of laboratory assays

To identify modules of correlated features from high-dimensional ‘omics data, we utilized Weighted Gene Co-expression Network Analysis (WGCNA) v1.71. We specified the module “value” as the first principal component of the features in the module to summarize each group of assay readouts for subsequent analysis. For interpretation, the features in each module were annotated to biological processes by performing an enrichment analysis leveraging biological knowledge bases, including MSigDB Hallmark gene sets and SMPDB metabolites and pathways. To identify the associations between different immune measurements and the four PRO clusters, we used two complementary approaches that each account for repeated measures per individual. The first used generalized linear mixed effects models (GLMs) that include a random effect for individual but do not account for the timing of sample collection; the second used generalized additive mixed effects models (GAMs) that do account for the timing of sample collection and allowed us to investigate longitudinal patterns. For both approaches, we utilized the measurements from samples collected within 28 days of hospital admission (with up to 6 samples per participant). In the GLM approach, we ignored the time of sample collection and identified features whose mean values, aggregated across timepoints, differed among the PRO clusters. In the GAM approach, we investigated whether there were differences either in the average values over time or in the temporal patterns of features among PRO clusters. Each model was adjusted for fixed effects of participant age and sex and for random effects of participant and enrollment site. We used the R packages “lme4” for the GLM approach and “gamm4” for the GAM approach. Significant associations were defined at a false discovery rate (FDR) < 0.05 using the Benjamini-Hochberg method to account for multiple comparisons.
For both approaches, significant features were tested by post-hoc pairwise comparisons between each pair of PRO clusters to facilitate interpretation. Features for which the aggregated mean values in the GLM, the average over time (referred to as the intercept in the gamm4 documentation), or the shape (referred to as the smoothing term in the gamm4 documentation) differed among PRO clusters at FDR < 0.05 were considered significant.
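The Benjamini-Hochberg adjustment used to define significance at FDR < 0.05 can be sketched as follows. This is a generic Python illustration of the procedure, not the study's analysis code, and the p-values in the example are made up.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical per-feature p-values from the model comparisons.
p_values = [0.001, 0.008, 0.039, 0.041, 0.20]
q_values = benjamini_hochberg(p_values)
significant = [q < 0.05 for q in q_values]
print(q_values, significant)
```

Note that the third and fourth features have raw p-values below 0.05 but adjusted values just above it, so only the first two features would be carried forward to post-hoc pairwise comparisons in this example.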

Case-control analysis: anti-IFN antibodies

To determine whether IFN autoantibodies were associated with viral burden, we performed a case-control analysis in which we identified age- and sex-matched controls for the 24 individuals who tested positive for IFN autoantibodies with blocking activity at their earliest hospital visit (3:1 ratio of controls to cases). SARS-CoV-2 viral load (N1 Ct and N2 Ct values) and SARS-CoV-2 RBD- and spike-binding IgG titers were compared between cases and controls. Differences in median levels between the two groups were assessed using the Wilcoxon rank-sum test.
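One way to draw a 3:1 age- and sex-matched control set is sketched below. The text does not specify the matching algorithm, so everything here is an assumption: greedy nearest-age matching within the same sex, matching without replacement, the `max_age_gap` tolerance of 5 years, and the example records.

```python
def match_controls(cases, pool, ratio=3, max_age_gap=5):
    """Greedily assign up to `ratio` same-sex, nearest-age controls per case.

    `cases` and `pool` are lists of dicts with keys "id", "age", and "sex".
    Controls are used at most once (matching without replacement).
    """
    available = list(pool)
    matches = {}
    for case in cases:
        eligible = [
            c for c in available
            if c["sex"] == case["sex"]
            and abs(c["age"] - case["age"]) <= max_age_gap
        ]
        # Prefer the controls closest in age to the case.
        eligible.sort(key=lambda c: abs(c["age"] - case["age"]))
        chosen = eligible[:ratio]
        for c in chosen:
            available.remove(c)
        matches[case["id"]] = [c["id"] for c in chosen]
    return matches


# Hypothetical records, for illustration only.
cases = [{"id": "case1", "age": 60, "sex": "M"}]
pool = [
    {"id": "c1", "age": 58, "sex": "M"},
    {"id": "c2", "age": 61, "sex": "M"},
    {"id": "c3", "age": 60, "sex": "F"},   # excluded: different sex
    {"id": "c4", "age": 66, "sex": "M"},   # excluded: age gap above tolerance
    {"id": "c5", "age": 63, "sex": "M"},
]
matches = match_controls(cases, pool)
print(matches)
```

A greedy scheme of this kind is simple but order-dependent; an optimal matching algorithm would be an alternative where the control pool is small relative to the number of cases.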

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.