Long COVID manifests with T cell dysregulation, inflammation and an uncoordinated adaptive immune response to SARS-CoV-2

Yin, Kailin; Peluso, Michael J.; Luo, Xiaoyu; Thomas, Reuben; Shin, Min-Gyoung; Neidleman, Jason; Andrew, Alicer; Young, Kyrlia C.; Ma, Tongcui; Hoh, Rebecca; Anglin, Khamal; Huang, Beatrice; Argueta, Urania; Lopez, Monica; Valdivieso, Daisy; Asare, Kofi; Deveau, Tyler-Marie; Munter, Sadie E.; Ibrahim, Rania; Ständker, Ludger; Lu, Scott; Goldberg, Sarah A.; Lee, Sulggi A.; Lynch, Kara L.; Kelly, J. Daniel; Martin, Jeffrey N.; Münch, Jan; Deeks, Steven G.; Henrich, Timothy J.; Roan, Nadia R.

doi:10.1038/s41590-023-01724-6

Letter
Open access
Published: 11 January 2024

Nature Immunology volume 25, pages 218–225 (2024)

Abstract

Long COVID (LC) occurs after at least 10% of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infections, yet its etiology remains poorly understood. We used ‘omic’ assays and serology to deeply characterize the global and SARS-CoV-2-specific immunity in the blood of individuals with clear LC and non-LC clinical trajectories, 8 months postinfection. We found that LC individuals exhibited systemic inflammation and immune dysregulation. This was evidenced by global differences in T cell subset distribution implying ongoing immune responses, as well as by sex-specific perturbations in cytolytic subsets. LC individuals displayed increased frequencies of CD4⁺ T cells poised to migrate to inflamed tissues and exhausted SARS-CoV-2-specific CD8⁺ T cells, higher levels of SARS-CoV-2 antibodies and a mis-coordination between their SARS-CoV-2-specific T and B cell responses. Our analysis suggested an improper crosstalk between the cellular and humoral adaptive immunity in LC, which can lead to immune dysregulation, inflammation and clinical symptoms associated with this debilitating condition.

Main

Intense efforts are underway to determine the pathophysiology of long COVID (LC), a set of conditions characterized by immune perturbations¹. T cells have important roles in severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) immunity and pathogenesis^2,3,4,5,6, yet relatively little is known about their role in LC. Here we used CyTOF, serology, RNA sequencing (RNA-seq), single‐cell RNA-seq (scRNA-seq) and plasma proteomics to obtain a deep phenotypic characterization of T cells in a well-matched set of LC and fully recovered (R) individuals to identify unique immune features associated with LC that inform on the mechanistic underpinnings of this condition.

We leveraged a well-characterized cohort (Long-term Impact of Infection with Novel Coronavirus (LIINC)⁷; Supplementary Tables 1–3) to analyze the blood from 27 LC and 16 R individuals, obtained 8 months postinfection (Fig. 1a) before any SARS-CoV-2 vaccination or reinfection. LC individuals, who consistently exhibited LC symptoms such as fatigue, ‘brain fog’ and sleep disturbance over 8 months, were 63% female and included 26% previously hospitalized for COVID-19 (Extended Data Fig. 1a–c and Supplementary Tables 1–3). Comorbidities such as hypertension were more common in LC individuals (6/27 for LC and 1/16 for R), who also had higher body mass index (BMI; Extended Data Fig. 1d,e). A CyTOF panel designed to interrogate the differentiation and/or activation states, effector functions and homing properties of T cells (Extended Data Fig. 1f and Supplementary Table 4) was applied to cryopreserved blood at baseline (post-thaw) or following stimulation with SARS-CoV-2 spike and T-scan peptides (Methods) to identify SARS-CoV-2-specific T cells through intracellular cytokine staining.

**Fig. 1: CD4⁺ T cell phenotypes are perturbed in individuals with LC.**

Both baseline and poststimulation datasets were gated on CD3⁺ events to identify T cells (Extended Data Fig. 1g,h), which were assessed for the expression of a panel of effector molecules, consisting of the cytokines interferon-γ (IFN-γ), tumor necrosis factor (TNF), interleukin (IL)-2, IL-4, IL-6, IL-17 and CCL4, and the cytolytic markers granzyme B and perforin (Extended Data Fig. 2a,b). Based on criteria comparing stimulated versus baseline samples (Methods), IFN-γ, TNF and/or IL-2 positivity identified SARS-CoV-2-specific CD4⁺ T cells, whereas IFN-γ, TNF and/or CCL4 positivity identified SARS-CoV-2-specific CD8⁺ T cells (Fig. 1b,c and Extended Data Fig. 2a,b). Using Boolean gating, we did not find significant differences between the frequencies of total SARS-CoV-2-specific CD4⁺ or CD8⁺ T cells (Fig. 1d), or those producing individual effector cytokines IFN-γ, TNF, IL-2 or CCL4 (Extended Data Fig. 2c,d) between LC and R individuals. Furthermore, the distribution of polyfunctional (producing at least two cytokines) SARS-CoV-2-specific CD4⁺ and CD8⁺ T cells was similar between LC and R individuals (Fig. 1e, f). However, SARS-CoV-2-specific IFN-γ⁺TNF⁺IL-2⁺CD4⁺ T cells and SARS-CoV-2-specific IFN-γ⁺TNF⁺CCL4⁺CD8⁺ T cells were more abundant, without reaching statistical significance, in R individuals (Fig. 1e,f). IL-6 expression in CD4⁺ T cells was induced exclusively in those with LC, albeit only in a small subset (14%; Extended Data Fig. 2e,f).

CD45RA⁺CD45RO⁻CCR7⁺CD95⁻ naïve T (T_N) cells, CD45RA⁺CD45RO⁻CCR7⁺CD95⁺ stem cell memory T (T_SCM) cells, CD45RA⁻CD45RO⁺CCR7⁺CD27⁺ central memory T (T_CM) cells, CD45RA⁻CD45RO⁺CCR7⁻CD27⁻ effector memory T (T_EM) cells, CD45RA⁻CD45RO⁺CCR7⁻CD27⁺ transitional memory T (T_TM) cells and CD45RA⁺CD45RO⁻CCR7⁻ effector memory RA T (T_EMRA) cells were identified in both CD4⁺ and CD8⁺ T cell compartments through manual gating (Extended Data Fig. 1i,j). In addition, CD45RA⁻CD45RO⁺CD127⁻CD25⁺ T regulatory (T_reg) cells and CD45RA⁻CD45RO⁺PD1⁺CXCR5⁺ peripheral T follicular helper (pT_FH) cells were identified in the CD4⁺ T cell compartment, and we additionally established a more stringent CD45RA⁻CD45RO⁺PD1^hiCXCR5^hi T_FH cell gate (Extended Data Fig. 1i). Total CD4⁺ T_CM, pT_FH, T_FH and T_reg cell subsets were more frequent in LC compared to R individuals, with no difference between LC and R in the other total CD4⁺ T cell subsets analyzed (Fig. 1g), while none of these subsets were significantly different between LC and R when examining SARS-CoV-2-specific CD4⁺ T cells (Fig. 1g,h). All analyzed subsets of total or SARS-CoV-2-specific CD8⁺ T cells were statistically similar between LC and R individuals (Extended Data Fig. 3).

Analysis of expression levels of all CyTOF markers in total or SARS-CoV-2-specific CD4⁺ or CD8⁺ T cells found that no markers were significantly differentially expressed between LC and R individuals (Extended Data Figs. 4 and 5). We found no significant differences in the percentages of CD4⁺ or CD8⁺ T cells expressing the acute activation markers CD38, HLA-DR and/or Ki67 in LC compared to R individuals (Extended Data Fig. 6). Clustering analyses (Methods) revealed that CD4⁺ T cells fell into six clusters (A1–A6) and CD8⁺ T cells into five clusters (B1–B5), none of which differed significantly between LC and R individuals (Extended Data Fig. 7a,e). However, cluster A1 was significantly underrepresented in LC compared to R females, but not in males, while cluster A4 was significantly underrepresented in LC compared to R males, but not in females (Extended Data Fig. 7b). Cluster A1 was composed of CD45RO^loCD45RA^hiCD4⁺ T_N cells and expressed low levels of activation markers (HLA-DR and Ox40) and inflammatory tissue-homing receptors (CD29 and CXCR4), as well as high levels of lymph node homing receptors (CD62L and CCR7; Extended Data Fig. 7c). Cluster A4 was composed of terminally differentiated CD45RO^hiCD27^loCD57^hi CD4⁺ T_EM cells and expressed high levels of receptors associated with homing to inflamed tissues (CD29, CXCR4 and CCR5) but not to lymph nodes (CD62L and CCR7). They also had high expression of cytolytic markers perforin and granzyme B (Extended Data Fig. 7d). Among CD8⁺ T cells, cluster B1 was significantly underrepresented in LC females, while cluster B2 was significantly overrepresented in LC females, compared to their R female counterparts, with no differences observed in males (Extended Data Fig. 7f). Cluster B1 comprised CD8⁺ T cells expressing markers of cluster A1 (CD45RO^loCD45RA^hiHLA-DR^loOx40^loCD29^loCXCR4^loCD62L^hiCCR7^hi), whereas cluster B2 comprised CD8⁺ T cells expressing markers of cluster A4 (CD27^loCD57^hiCD29^hiCXCR4^hiCCR5^hiCD62L^loCCR7^lo). These observations suggested that females with LC had relatively low frequencies of resting CD4⁺ and CD8⁺ T_N cells, which expressed low levels of inflammatory tissue-homing receptors, and high frequencies of terminally differentiated CD4⁺ and CD8⁺ T_EM cells, which expressed inflammatory tissue-homing receptors and cytolytic markers.

The t-distributed stochastic neighbor embedding (t-SNE) visualization of SARS-CoV-2-specific CD4⁺ T cells indicated that those from LC and R individuals tended to concentrate in different areas (Fig. 2a). The tissue-homing receptors CXCR4, CXCR5 and CCR6 were expressed at higher levels on SARS-CoV-2-specific CD4⁺ T cells from LC as compared to R individuals (Fig. 2b). Manual gating showed that the percentages of SARS-CoV-2-specific CXCR4⁺CXCR5⁺CD4⁺ T cells and CXCR5⁺CCR6⁺CD4⁺ T cells were significantly increased, and CXCR4⁺CCR6⁺CD4⁺ T cells showed a trend toward higher percentages, in LC compared to R individuals (Fig. 2c). Higher percentages of total CXCR4⁺CXCR5⁺CD4⁺ T cells and CXCR5⁺CCR6⁺CD4⁺ T cells were found in LC compared to R as well (Fig. 2d). Flow cytometric analysis of the same LC and R specimens found statistically significantly elevated frequencies of CXCR4⁺CXCR5⁺CD4⁺, CXCR5⁺CCR6⁺CD4⁺ and CXCR4⁺CCR6⁺CD4⁺ T cells in LC compared to R (Extended Data Fig. 8a–c). Expression of CXCR5 is common among the CXCR4⁺CXCR5⁺CD4⁺ T cell, CXCR5⁺CCR6⁺CD4⁺ T cell and pT_FH cell subsets, and we observed significant positive associations between the percentages of pT_FH cells and other CXCR5⁺CD4⁺ T cells, particularly in the LC group (Fig. 2e,f).

**Fig. 2: SARS-CoV-2-specific CD4⁺ T cells from individuals with LC preferentially express homing receptors associated with migration to inflamed tissues.**

SARS-CoV-2-specific CD8⁺ T cells were also globally different between LC and R (Fig. 3a), and those from the individuals with LC preferentially expressed the checkpoint markers PD1 and CTLA4, but not TIGIT (Fig. 3b). Consistently, SARS-CoV-2-specific PD1⁺CTLA4⁺CD8⁺ T cells were significantly elevated in LC compared to R individuals, while SARS-CoV-2-specific TIGIT⁺CTLA4⁺CD8⁺ or PD1⁺TIGIT⁺CD8⁺ T cells were not (Fig. 3c). However, the frequencies of total PD1⁺CTLA4⁺CD8⁺ T cells were similar in the LC and R groups (Fig. 3d and Extended Data Fig. 8d).

**Fig. 3: SARS-CoV-2-specific CD8⁺ T cells from individuals with LC preferentially express the exhaustion markers PD1 and CTLA4.**

Serological analysis indicated significantly higher (2.3×) total receptor binding domain (RBD)-specific antibody titers in LC as compared to R individuals (Fig. 4a). LC individuals with the highest frequencies of SARS-CoV-2-specific PD1⁺CTLA4⁺CD8⁺ T cells had near undetectable antibody levels (Fig. 4b). LC individuals with the highest frequencies of SARS-CoV-2-specific PD1⁺CTLA4⁺CD8⁺ T cells had the lowest frequencies of SARS-CoV-2-specific CD4⁺ T_reg cells, and the frequencies of these two subsets of cells negatively correlated in LC, but not R individuals (Fig. 4b). A significant positive correlation between RBD-specific titers and total SARS-CoV-2-specific CD4⁺ and CD8⁺ T cell frequencies was detected in R but not LC individuals (Fig. 4c). The frequencies of SARS-CoV-2-specific pT_FH cells also correlated positively with RBD-specific antibody titers in R but not LC individuals (Fig. 4c), suggesting that a mis-coordinated humoral and cell-mediated response, previously implicated in severe COVID-19 (ref. ⁸), may also be a hallmark of LC.

**Fig. 4: Humoral and cellular immunity are discoordinated in individuals with LC.**

Bulk RNA-seq identified only two genes, OR7D2 and ALAS2, that were significantly differentially expressed between LC and R. OR7D2 encodes a G-protein-coupled receptor that is activated by odorant molecules, whereas ALAS2 encodes an enzyme that catalyzes the first step in heme synthesis to generate δ-aminolevulinic acid from succinyl-CoA and glycine. Both OR7D2 and ALAS2 were overexpressed in LC individuals although not necessarily together, as the four individuals with the highest OR7D2 expression in peripheral blood mononuclear cells (PBMCs) did not have the highest ALAS2 expression (Fig. 5a). Supervised clustering found upregulation of a module of genes that regulate heme synthesis and carbon dioxide transport (ALAS2, HBB, CA1, HBA1, SLC4A1, HBD and HBA2) and the downregulation of a module consisting of immunoglobulin kappa, lambda and heavy chain genes in LC compared to R individuals (Fig. 5b,c), suggesting the involvement of heme biosynthesis and immune dysregulation in LC.

**Fig. 5: Global changes in gene and gene product expression in the blood of individuals with LC.**

To gain a more granular view of the transcriptome, we selected a subset of the specimens analyzed by bulk RNA-seq for repeat analysis by scRNA-seq. We limited these studies to females because individuals with high levels of OR7D2 or ALAS2 were mostly female (the top five OR7D2 expressors were female, as were five of the top six ALAS2 expressors). For comparison, we included four randomly selected females from the R specimens. Integration of data from all 12 samples identified 11 clusters of cells and revealed that the granulocyte cluster was significantly less abundant (P = 0.006) and the platelet cluster more abundant (P = 0.01) in LC compared to R individuals, while the other clusters (CD4⁺ T cells, CD8⁺ T cells, CTLs, B cells, monocytes and NKT/NK/MAIT/γδ T cells) did not differ between the groups (Fig. 5d). Visualization based on LC versus R status, or based on OR7D2^hi LC versus ALAS2^hi LC, did not reveal profound differences (Extended Data Fig. 9a,b). Among all cells, OR7D2 expression was highest in cells of the OR7D2^hi LC group and ALAS2 was highest in cells of the ALAS2^hi LC group, and all clusters except granulocytes and platelets expressed OR7D2 and ALAS2 (Extended Data Fig. 9c–e).

Interrogation of cluster-specific gene expression identified three additional genes (THEMIS, NUDT2 and PPIE) that were differentially expressed (P < 0.05) in LC individuals, two within CD8⁺ T cell cluster 1 and one within monocyte cluster 3 (Fig. 5e). Using a less stringent cutoff (P < 0.1), we found 16 differentially expressed genes (DEGs) within CD8⁺ T cell cluster 1 (for example, THEMIS, HMGB2 and TNFRSF18), monocyte cluster 3 (PPIE) and CD4⁺ T cell cluster 7 (for example, CAST and APBA2; Fig. 5f and Supplementary Table 5). Gene Ontology (GO) pathway analysis found significant (P < 0.05) differences between LC and R individuals within monocyte cluster 3, in pathways associated with transcriptional regulation and splicing, protein regulation and neutrophil degranulation (Supplementary Table 6). Trends (P < 0.1) were observed for pathways associated with apoptosis and metabolism and/or oxidative stress in CD8⁺ T cell cluster 1 (Supplementary Table 7). CXCR4, CXCR5 and CCR6 were upregulated in CD4⁺ T cell clusters 0 and 7 from LC compared to their counterpart clusters in R (Extended Data Fig. 9f). Comparison of OR7D2^hi LC versus R revealed 35 DEGs in the OR7D2^hi LC group (Extended Data Fig. 9g and Supplementary Table 8) including upregulation of the histone family genes HIST1H2AM, HIST2H2AC and HIST1H1E, while comparison of ALAS2^hi LC versus R revealed 14 DEGs including upregulation of THEMIS and downregulation of BACH2 (P < 0.05; Extended Data Fig. 9h and Supplementary Table 9). GO pathways associated with the OR7D2^hi LC DEGs included lipid transport and stress responses in CD4⁺ T cell cluster 7, RNA splicing in CD8⁺ T cell cluster 5 and immunoglobulin (Ig) production in B cell cluster 8 (Supplementary Table 10), while those associated with the ALAS2^hi LC DEGs included apoptosis and oxidative stress responses in CD8⁺ T cell cluster 1 (Supplementary Table 11).

Olink proteomics indicated elevated expression of proteins associated with inflammation (LGALS9, CCL21, CCL22, TNF, CXCL10 and CD48) and immune regulation (IL1RN and CD22) in LC compared to R individuals (Fig. 5g). LC individuals had elevated expression of IL-4 and decreased expression of IL-5 compared to R individuals (Fig. 5g,h), although both cytokines are associated with T helper 2 (T_H2) cell responses. CCL22, a ligand for the T_H2 cell marker CCR4, was expressed at elevated levels in LC compared to R individuals (Fig. 5h). IL-4, but not IL-5 or CCL22, was significantly positively associated with the percentages of total CXCR4⁺CXCR5⁺CD4⁺ and CXCR5⁺CCR6⁺CD4⁺ T cells in LC individuals (Extended Data Fig. 8e), suggesting an elevated, yet mis-coordinated, T_H2 cell response during LC.

In summary, using multiple ‘omics’ analytical approaches, we found that LC individuals exhibited phenotypic perturbations in both total and SARS-CoV-2-specific CD4⁺ and CD8⁺ T cells and changes in gene expression among CD4⁺ T cells, CD8⁺ T cells, monocytes and B cells. We found higher proportions of CD4⁺ T_CM cells, T_FH cells and T_reg cells in LC compared to R individuals. SARS-CoV-2-specific CD8⁺ T cells, but not total CD8⁺ T cells, more frequently expressed the exhaustion markers PD1 and CTLA4, consistent with ongoing stimulation by viral antigens. Further supporting a potential persistent reservoir was our observation of higher SARS-CoV-2 antibody levels in LC individuals, consistent with reports of higher spike-specific IgG in LC compared to R individuals⁹. CyTOF, flow cytometry and scRNA-seq indicated that CD4⁺ T cells from LC individuals preferentially expressed CXCR4, CXCR5 and CCR6. CXCR4 expression is elevated on bystander CD4⁺ and CD8⁺ T cells in fatal COVID-19 (ref. ⁴) and on pulmonary CD4⁺ T cells, B cells, macrophages and granulocytes in the context of LC following SARS-CoV-2 infection of mice¹⁰. Although fully recovered individuals exhibited coordinated humoral and cellular immune responses to SARS-CoV-2, this coordination was lost in LC individuals, consistent with observations that about half of individuals with LC with no detectable SARS-CoV-2 antibodies have detectable SARS-CoV-2-specific T cell responses¹¹. How the humoral response becomes divorced from the cellular response is unclear, but could involve a misalignment between IL-4 and IL-5 production by T_H2 cells, as indicated by our Olink analysis.

Our study has limitations. First, the cohort analyzed included only 43 participants; however, the rigor with which participants were characterized mitigates the limitations of the small sample size. Some findings were driven by small subsets of LC individuals, which is consistent with the notion of LC being a heterogeneous disease, and will require validation in larger cohorts. Second, due to limited channels available for CyTOF, we did not examine additional markers that would have been of interest such as the exhaustion marker thymocyte selection-associated high mobility group protein (TOX)¹², the activation marker CD40L and the proliferation marker 5-Iodo-2'-deoxyuridine (IdU)¹³. Third, the changes we saw in the blood subsets could reflect migration to tissues. Finally, our study was for the most part descriptive. However, for new and poorly understood diseases, in-depth ‘omics’-based characterization of a well-annotated cohort is the critical first step for better understanding the condition’s etiology and mechanistic underpinnings.

Methods

Study participants

Participants were enrolled in LIINC (www.liincstudy.org; NCT04362150)⁷, a prospective observational study enrolling individuals with prior nucleic acid-confirmed SARS-CoV-2 infection, regardless of the presence or absence of postacute symptoms. At each study visit, participants underwent an interviewer-administered assessment of 32 physical symptoms that were newly developed or had worsened since the COVID-19 diagnosis. Detailed data regarding medical history, COVID-19 history, SARS-CoV-2 vaccination and SARS-CoV-2 reinfection were collected. Two participants had biospecimens collected via the COVID-19 Host Immune Response Pathogenesis (CHIRP) study⁵. For the present study, we selected participants who consistently met, or consistently did not meet, a case definition for LC based on the presence or absence of at least one symptom attributable to COVID-19 for the 8 months following SARS-CoV-2 infection (Fig. 1a). The LC group (n = 27) had a median age of 46 years, was 63% female and included 26% who were previously hospitalized for COVID-19. The R group (n = 16) had a median age of 45.5 years, was 44% female and included 12.5% who were previously hospitalized for COVID-19 (Supplementary Table 1). Participants were deliberately not matched by age and sex, but we ensured that there was overlap in the groups. Blood samples were collected between September 16, 2020 and April 6, 2021. All participants provided a post-COVID blood sample before any SARS-CoV-2 vaccination to exclude the potential effects of vaccination on our study. Specimens were collected 8 months postinfection. All assays were performed from the same parent set of n = 27 LC and n = 16 R specimens. All participants provided written informed consent.

Biospecimen collection

Whole blood was collected in EDTA tubes followed by isolation of PBMCs and plasma as described in ref. ¹⁵. Serum was obtained concomitantly from serum-separator tubes.

Serology

Antibody responses against SARS-CoV-2 spike RBD were measured on sera using the Pylon COVID-19 total antibody assay (ET Health) and reported as relative fluorescence units (RFUs).

SARS-CoV-2 peptides

Peptides used for T cell stimulation comprised a mix of overlapping 15-mers spanning the entire SARS-CoV-2 spike protein (PM-WCPV-S-1, purchased from JPT), and peptides corresponding to CD8⁺ T cell epitopes identified by T-scan¹⁶ synthesized in-house (Supplementary Table 12). Final peptide concentrations were 300 nM for the 15-mers and 450 nM for the T-scan peptides.

CyTOF

Sample preparation was performed similar to methods described^2,3,4,5. Upon revival of cryopreserved PBMCs, cells were rested overnight to allow for antigen recovery¹⁷ and then divided equally into two aliquots. To the first aliquot, we added 3 µg ml⁻¹ brefeldin A (BFA; to enable intracellular cytokine detection), the costimulation agonists anti-CD28 (2 µg ml⁻¹; BD Biosciences) and anti-CD49d (1 µg ml⁻¹; BD Biosciences), and the SARS-CoV-2 peptide pool prepared as described above. To the second aliquot, we added 1% DMSO (Sigma-Aldrich) and 3 µg ml⁻¹ BFA. Cells from both treatments were incubated at 37°C for 6 h. Cells were treated with cisplatin (Sigma-Aldrich) as a live/dead distinguisher and fixed in paraformaldehyde (Electron Microscopy Sciences) as described^2,3,4,5. CyTOF antibody conjugation was performed using the Maxpar X8 Antibody Labeling Kit (Standard BioTools) according to the manufacturer’s instructions. CyTOF staining was performed as described^2,3,4,5, but using the CyTOF panel created for this study (Supplementary Table 4). Stained samples were washed with CAS buffer (Standard BioTools), spiked with 10% (vol/vol) EQ Four Element Calibration Beads (Standard BioTools) and run on a Helios CyTOF instrument (UCSF Parnassus Flow Core).

CyTOF data analyses

Data preprocessing

EQ bead-normalized CyTOF datasets were concatenated, de-barcoded and normalized using Standard BioTools Software version 6.7. Following arcsinh transformation of the data¹⁸, cells were analyzed by FlowJo (version 10.8.1, BD Biosciences). Intact (Ir191⁺Ir193⁺), live (Pt195⁻), singlet events were identified, followed by gating on CD3⁺ T cells, and sub-gating on CD4⁺ T cells and CD8⁺ T cells (Extended Data Fig. 1g,h).
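The arcsinh transformation mentioned above can be illustrated with a minimal Python sketch. The cofactor of 5 is a common convention for mass cytometry data and is an assumption here, since the text does not state the value used; the function name and the example counts are likewise hypothetical.

```python
import math

def arcsinh_transform(counts, cofactor=5.0):
    """Return arcsinh(x / cofactor) for each raw ion count.

    The cofactor is an assumed value, not one reported in the text.
    """
    return [math.asinh(x / cofactor) for x in counts]

# Hypothetical raw counts for one marker across four events.
raw = [0, 5, 50, 500]
transformed = arcsinh_transform(raw)
```

The transform is close to linear near zero and close to logarithmic for large counts, which is why it is often preferred over a plain log for data containing zeros.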

CyTOF antibody validation

CyTOF antibodies in our panel (Supplementary Table 4) were validated using methods previously described, including the use of human lymphoid aggregate cultures generated from tonsils^2,3,4,5,18,19. The observed expression patterns among tonsillar T and B cells (Extended Data Fig. 10a) were similar to those previously observed¹⁸. To validate the detection of cytokines and other effectors, we stimulated PBMCs with 16 nM phorbol 12-myristate 13-acetate (PMA) (Sigma-Aldrich) and 1 μM ionomycin (Sigma-Aldrich), or 1 μg ml⁻¹ lipopolysaccharides (LPS; eBioscience), for 4 h in the presence of 3 μg ml⁻¹ BFA solution (eBioscience), combined the cells and prepared them for CyTOF as described above. We observed the expected induction of cytokines or cytolytic markers (Extended Data Fig. 10b)^2,3,4,5 and preferential expression of T_reg lineage marker Foxp3 among CD3⁺CD4⁺CD45RO⁺CD45RA⁻CD127⁻CD25⁺ T_reg cells (Extended Data Fig. 10c). We also observed preferential expression of CD30 and Ki67 in CD4⁺ T_M as compared to CD4⁺ T_N cells (Extended Data Fig. 10d). Examples of pT_FH and T_FH gates are depicted in Extended Data Fig. 10e.

Identification of SARS-CoV-2-specific T cells

For identification of SARS-CoV-2-specific T cells, we compared unstimulated specimens to their peptide-stimulated counterparts. Effector cytokines (IFN-γ, TNF, IL-2, IL-4, IL-6, IL-17 and CCL4) and cytolytic effectors (granzyme B and perforin) were assessed for the ability to identify antigen-specific T cells at the single-cell level. The following criteria were established to identify effector molecules appropriate for identifying SARS-CoV-2-specific T cells: (1) the count of positive cells in the unstimulated sample (not receiving peptide) was fewer than 5 events, or the frequency of positive cells was lower than 0.1%; (2) the count of positive cells in the peptide-stimulated sample was at least 5, or the frequency was higher than 0.1%; (3) the difference in the frequency of positive cells between unstimulated and peptide-stimulated samples was at least 0.01%; (4) the fold change in the frequency of positive cells between unstimulated and peptide-stimulated samples was greater than 10; and (5) the aforementioned four criteria could identify SARS-CoV-2-specific T cells among >50% of participants. Effectors that fulfilled all five criteria were IFN-γ, TNF and IL-2 for CD4⁺ T cells and IFN-γ, TNF and CCL4 for CD8⁺ T cells. For a sub-analysis to identify responding cells that may only exist in a small subset of individuals, we removed criterion 5 and reduced the positive cell count threshold to 3 within criteria 1 and 2. This approach allowed us to determine that SARS-CoV-2-specific CD4⁺ T cells producing IL-6 were exclusively detected in LC individuals (Extended Data Fig. 2f). SARS-CoV-2-specific T cells were detected at a median of 163 cells (134 for CD4⁺ T cells and 29 for CD8⁺ T cells) and a mean of 221.7 cells (185.2 for CD4⁺ T cells and 36.4 for CD8⁺ T cells), per participant. SARS-CoV-2-specific T cells, once identified, were analyzed by Boolean gating²⁰ and exported for further analyses.
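The five selection criteria can be expressed compactly in code. The following Python sketch is one possible reading of them, not the authors' implementation: the function names are hypothetical, frequencies are taken to be percentages, the fold change is assumed to be stimulated over unstimulated, and a zero background frequency is treated as an infinite fold change. The example rows are invented values for illustration only.

```python
def effector_passes(unstim_count, unstim_freq, stim_count, stim_freq, min_count=5):
    """Apply criteria 1-4 to one effector in one participant (frequencies in %)."""
    c1 = unstim_count < min_count or unstim_freq < 0.1   # low background
    c2 = stim_count >= min_count or stim_freq > 0.1      # detectable response
    c3 = (stim_freq - unstim_freq) >= 0.01               # absolute difference
    # Assumption: zero background counts as an infinite fold change.
    fold = float("inf") if unstim_freq == 0 else stim_freq / unstim_freq
    c4 = fold > 10                                       # relative difference
    return c1 and c2 and c3 and c4

def effector_is_informative(per_participant):
    """Criterion 5: criteria 1-4 hold in more than 50% of participants."""
    passed = sum(effector_passes(*row) for row in per_participant)
    return passed / len(per_participant) > 0.5

# Invented rows: (unstim_count, unstim_freq, stim_count, stim_freq)
rows = [(1, 0.002, 40, 0.08), (0, 0.0, 12, 0.03), (6, 0.2, 8, 0.25)]
informative = effector_is_informative(rows)
```

The relaxed sub-analysis described in the text would correspond to calling `effector_passes` with `min_count=3` and skipping `effector_is_informative`.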

SPICE

SPICE analyses were performed using version 6.1 software²¹. CD4⁺ and CD8⁺ T cells were subjected to manual gating based on the expression of cytokines used to define SARS-CoV-2-specific T cells (IFN-γ, TNF, IL-2 and CCL4, see above) using operations of Boolean logic. The parameters for running the dataset were as follows: iterations for permutation test = 10,000 and highlight values = 0.05. The parameters for the query structure were set as follows: values = frequency of single cytokine positive cells in total CD4⁺/CD8⁺ T cells; category = IFN-γ, TNF, IL-2 and CCL4; overlay = patient type (LC versus non-LC); group = all other variables in the data matrix.
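To make the Boolean logic behind this kind of analysis concrete, the sketch below counts cells in every positive/negative combination of three cytokines and derives the polyfunctional fraction (cells producing at least two cytokines, as defined in the main text). It is an illustrative stand-in and not SPICE itself; the marker labels, function names and the four example cells are hypothetical.

```python
from collections import Counter
from itertools import product

CYTOKINES = ("IFNg", "TNF", "IL2")  # hypothetical column labels

def boolean_gate(cells, markers=CYTOKINES):
    """Count cells in each of the 2^n positive/negative marker combinations."""
    counts = Counter({combo: 0 for combo in product((True, False), repeat=len(markers))})
    for cell in cells:
        counts[tuple(bool(cell[m]) for m in markers)] += 1
    return counts

def polyfunctional_fraction(counts):
    """Fraction of responding cells (>= 1 cytokine) that produce >= 2 cytokines."""
    responding = sum(n for combo, n in counts.items() if sum(combo) >= 1)
    poly = sum(n for combo, n in counts.items() if sum(combo) >= 2)
    return poly / responding if responding else 0.0

# Invented example: 1 = positive, 0 = negative for each cytokine.
cells = [
    {"IFNg": 1, "TNF": 1, "IL2": 1},
    {"IFNg": 1, "TNF": 0, "IL2": 0},
    {"IFNg": 0, "TNF": 1, "IL2": 1},
    {"IFNg": 0, "TNF": 0, "IL2": 0},
]
counts = boolean_gate(cells)
fraction = polyfunctional_fraction(counts)
```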

T cell subsetting

Manual gating was performed using R (version 4.1.3). Arcsinh-transformed data corresponding to total or SARS-CoV-2-specific CD4⁺ or CD8⁺ T cells were plotted as 2D plots using the CytoExploreR package. Visualization of datasets by t-SNE was performed using methods similar to those described^2,3,4,5. CytoExploreR and tidyr packages were used to load the data, and t-SNE was performed using Rtsne and RColorBrewer packages on arcsinh-transformed markers. Total CD4⁺/CD8⁺ T cells were downsampled to n = 8,000 (maximal cell number for individual samples) before t-SNE analysis. The parameters for t-SNE were set as iteration = 1,000, perplexity = 30 and θ = 0.5.

T cell clustering analysis

Flow cytometry standard (FCS) files corresponding to total and SARS-CoV-2-specific CD4⁺ and CD8⁺ T cells were imported into R for data transformation, using the flowCore, expss, class and openxlsx packages. Arcsinh-transformed data were then exported as CSV files for clustering analyses. Biological (LC status, biological sex and hospitalization status) and technical (batch/run of processing) variables were visualized using the DimPlot function of Seurat²². Batch correction was performed by RunHarmony²³. Optimal clustering resolution parameters were determined using Random Forests²⁴ and a silhouette score-based assessment of clustering validity and subject-wise cross-validation, as detailed in ref. ²⁵. A generalized linear mixed model (GLMM, implemented in the lme4 (ref. ²⁶) package in R with the family argument set to the binomial probability distribution) was used to estimate the association between cluster membership and LC status and the sex of the participant, with the participant modeled as a random effect. For each individual, cluster membership of cells was encoded as a pair of numbers representing the number of cells in the cluster and the number of cells not in the cluster. Clusters having fewer than three cells were discarded. The sex-specific log odds ratio of cluster membership association with LC status was estimated using the emmeans²⁷ R package using the GLMM model fit. The estimated log odds ratio represented the change (due to LC status) in the average over all participants of a given sex in the log odds of cluster membership. The two-sided P values corresponding to the null hypothesis of an odds ratio value of 1 were computed based on a z statistic in the GLMM model fit. These P values were adjusted for multiple testing using the Benjamini–Hochberg method.
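The Benjamini–Hochberg adjustment used in the final step can be sketched in a few lines of Python. This is a generic textbook version of the procedure, shown only for illustration; the analysis itself was carried out in R, and the function name and example P values below are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending P
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P to the smallest so that adjusted values
    # never increase as the raw P value decreases.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Invented P values, one per cluster-level test.
p = [0.01, 0.04, 0.03, 0.20]
q = benjamini_hochberg(p)
```

Each raw P value is scaled by the number of tests divided by its rank, and the running minimum enforces monotonicity across the ranked list.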

Flow cytometry

Flow cytometry was performed on PBMCs from 25 LC and 15 R individuals from our cohort, obtained from aliquots of specimens analyzed by CyTOF. Cells were stained with the panel shown in Supplementary Table 13, using Zombie UV or Zombie NIR (BioLegend) as viability indicators. All cells were analyzed on a Fortessa X-20 (BD Biosciences). FCS files were exported into FlowJo (BD, version 10.9.0) for further analysis. Flow cytometric data were arcsinh-scaled before analyses. In flow cytometric experiments, SARS-CoV-2-specific CD8⁺ T cells were defined as those specifically inducing IFN-γ and/or TNF in response to SARS-CoV-2 peptide stimulation, as the CCL4 antibody exhibited background staining in flow cytometry and could not be used to define SARS-CoV-2-specific T cells.

RNA-seq

RNA-seq was performed on PBMCs from 23 LC and 13 R individuals from our cohort, obtained from aliquots of specimens analyzed by CyTOF. Samples were prepared using the AllPrep kit (Qiagen) per the manufacturer’s instructions. RNA libraries, next-generation Illumina sequencing, quality control analysis, trimming and alignment were performed by Genewiz (Azenta). Briefly, following oligo dT enrichment, fragmentation and random priming, cDNA syntheses were completed. End repair, 5′ phosphorylation and dA-tailing were performed, followed by adaptor ligation, PCR enrichment and sequencing on an Illumina HiSeq platform using PE150 (paired-end sequencing, 150 bp for reads 1 and 2). Raw reads (480 Gb in total) were trimmed using Trimmomatic (version 0.36) to remove adapter sequences and poor-quality reads. Trimmed reads were mapped to Homo sapiens GRCh37 using the STAR aligner (version 2.5.2b)²⁸. log₂ fold changes were calculated between LC versus R individuals. Two-sided P values corresponding to a null hypothesis of fold change of 1 were calculated using the Wald test in DESeq2 (ref. ²⁹) and were adjusted for multiple testing using false discovery rates. Genes with an adjusted P value < 0.05 and absolute log₂(fold change) > 1 were considered significant DEGs. Clustered heatmaps of DEGs were constructed with groups of genes (rows) defined using the k-means algorithm to cluster genes into k clusters based on their similarity. k = 4 was determined using the Hierarchical Ordered Partitioning and Collapsing Hybrid (HOPACH) algorithm³⁰, which recursively partitions a hierarchical tree while ordering and collapsing clusters at each level to identify the level of the tree with maximally homogeneous clusters.
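The DEG significance rule stated above (adjusted P < 0.05 and absolute log₂ fold change > 1) amounts to a simple filter. The Python sketch below shows that filter on a hypothetical results table; the field names, gene labels and numbers are invented placeholders and do not reproduce any value from the study.

```python
def significant_degs(results, padj_cutoff=0.05, lfc_cutoff=1.0):
    """Return gene names with adjusted P < padj_cutoff and |log2FC| > lfc_cutoff."""
    return [
        r["gene"]
        for r in results
        # Skip genes whose adjusted P value is missing (None).
        if r["padj"] is not None
        and r["padj"] < padj_cutoff
        and abs(r["log2fc"]) > lfc_cutoff
    ]

# Invented placeholder rows, not data from the study.
results = [
    {"gene": "geneA", "log2fc": 1.8, "padj": 0.01},   # passes both thresholds
    {"gene": "geneB", "log2fc": -2.1, "padj": 0.03},  # passes (downregulated)
    {"gene": "geneC", "log2fc": 0.4, "padj": 0.001},  # effect too small
    {"gene": "geneD", "log2fc": 3.0, "padj": 0.20},   # not significant
    {"gene": "geneE", "log2fc": 2.5, "padj": None},   # no adjusted P available
]
degs = significant_degs(results)
```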

scRNA-seq

scRNA-seq was performed on PBMCs from 8 LC and 4 R individuals from our cohort, obtained from aliquots of specimens analyzed by CyTOF. Library preparation was performed using the Chromium Next GEM Single-Cell 5′ Reagent Kits v2 (10x Genomics) and sequenced on the Illumina NovaSeq 6000 S4 300 platform. Samples were sequenced at a mean of >50k reads per cell (minimum 51k, maximum 120k and median 83k). A median of 7,888 cells was analyzed per donor (minimum 4,189 and maximum 9,511). Demultiplexed fastq files were aligned to human reference genome GRCh38 using the 10x Genomics Cell Ranger v7.1.0 count pipeline³¹. The include-introns flag for the count pipeline was set to true to count reads mapping to intronic regions. The filtered count matrices generated by the Cell Ranger count pipeline were processed using Seurat²². Each sample was preprocessed as a Seurat object, and the top 1% of cells per sample with the highest numbers of unique genes, cells with ≤200 unique genes and cells with ≥10% mitochondrial genes were filtered out. The samples were then merged into a single Seurat object, and normalization and variance stabilization were performed using sctransform with the ‘glmGamPoi’ method³² for initial parameter estimation.
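The per-sample cell filtering described above can be sketched in plain Python. This is an illustration only: the study applied these filters to Seurat objects in R, and the way the "top 1%" is computed here (integer floor of 1% of the cells in the sample) as well as the dictionary field names are assumptions.

```python
def qc_filter(cells, top_percent=1, min_genes=200, max_pct_mito=10.0):
    """Filter one sample's cells.

    cells: list of dicts with keys 'barcode', 'n_genes' and 'pct_mito'
    (hypothetical field names). Removes the top `top_percent` percent of
    cells by unique gene count, cells with <= min_genes unique genes and
    cells with >= max_pct_mito percent mitochondrial reads.
    """
    n_top = len(cells) * top_percent // 100
    by_genes = sorted(cells, key=lambda c: c["n_genes"], reverse=True)
    too_complex = {c["barcode"] for c in by_genes[:n_top]}
    return [c for c in cells
            if c["barcode"] not in too_complex
            and c["n_genes"] > min_genes
            and c["pct_mito"] < max_pct_mito]

# Hypothetical sample of 100 cells with increasing gene counts.
cells = [{"barcode": f"cell{i}", "n_genes": 300 + 10 * i, "pct_mito": 2.0}
         for i in range(100)]
cells[0]["n_genes"] = 150    # too few unique genes
cells[1]["pct_mito"] = 12.0  # too many mitochondrial reads
kept = qc_filter(cells)
```

Here three cells are dropped: one for low gene count, one for high mitochondrial content and the single cell with the highest gene count (the top 1% of 100 cells).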

Graph-based clustering was performed using the Seurat²² functions FindNeighbors and FindClusters. First, the cells were embedded in a k-nearest neighbor graph (with k = 20) based on the Euclidean distance in the principal component analysis (PCA) space. The edge weights between any two cells were further modified using Jaccard similarity. Next, clustering was performed using the Louvain algorithm³³ implementation in the FindClusters Seurat function. Clustering with 15 principal components (PCs, determined based on the location of the elbow in the plot of variance explained by each of the top 25 PCs) and 0.1 resolution (determined using the resolution optimization method described above for CyTOF data clustering) resulted in 11 distinct biologically relevant clusters (clusters 0–11), which were used for further analyses. Marker genes for each cluster were identified using the FindAllMarkers Seurat function. Marker genes were filtered to keep only expressed genes detected in at least 25% of the cells, with at least 0.5 log₂ fold change. Cluster annotation was performed according to previously established subset definitions³⁴,³⁵,³⁶. Classification markers included CD19, MS4A1 and CD79A for B cells; CD3D, CD3E, CD5 and IL7R for CD4⁺ T cells; CD3D, CD3E, CD8A, CD8B and GZMK (CTL subset) for CD8⁺ T cells; CD14, CD68, CYBB, S100A8, S100A9, S100A12 and LYZ for monocytes; CSF2RA, LYZ, CXCL8 and CD63 for granulocytes; and PF4, CAVIN2, PPBP, GNG11 and CLU for platelets.
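The idea behind the Jaccard-weighted neighbor graph can be sketched as follows. This toy version uses exact neighbor search, counts each cell as its own neighbor and uses k = 3 on six hypothetical points; Seurat's FindNeighbors differs in implementation details (for example, edge pruning) that are not reproduced here.

```python
import math

def knn_sets(points, k):
    """For each point, return the set of indices of its k nearest points
    by Euclidean distance (the point itself is included, by assumption)."""
    neighbors = []
    for p in points:
        ranked = sorted(range(len(points)),
                        key=lambda j: math.dist(p, points[j]))
        neighbors.append(set(ranked[:k]))
    return neighbors

def jaccard_weights(neighbors):
    """Weight each pair of cells by the Jaccard similarity of their
    neighbor sets; pairs with no shared neighbors get no edge."""
    weights = {}
    n = len(neighbors)
    for i in range(n):
        for j in range(i + 1, n):
            shared = len(neighbors[i] & neighbors[j])
            if shared:
                weights[(i, j)] = shared / len(neighbors[i] | neighbors[j])
    return weights

# Hypothetical coordinates in a 2D "PCA space": two well-separated groups.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
weights = jaccard_weights(knn_sets(points, k=3))
```

Cells within a group share all of their neighbors and receive the maximum weight, while cells from different groups share none and are left unconnected, which is the structure a community detection method such as Louvain then exploits.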

The counts-per-million reads for ALAS2 and OR7D2 were assessed using edgeR³⁷, and associations with group status were made using the two-sample Welch t test, followed by multiple testing correction using the Holm³⁸ procedure. For establishing associations between clusters and group status, GLMM implemented in the lme4 R package was used. The model was fitted with the family argument set to the binomial probability distribution and with the ‘nAGQ’ parameter set to 10, corresponding to the number of points per axis for evaluating the adaptive Gauss–Hermite approximation for the log-likelihood estimation. Cluster membership was modeled as a response variable by a two-dimensional vector representing the number of cells from a given sample belonging or not to the cluster under consideration. The corresponding sample from which the cell was derived was the random effect variable, and the group (R, LC, OR7D2^high LC, or ALAS2^high LC) was considered the fixed effect variable. The log odds ratio for all pairwise comparisons was estimated using the model fits provided to the emmeans function in the emmeans R package²⁷. The resulting P values for the estimated log odds ratio and clusters were adjusted for multiple testing using the Benjamini–Hochberg method³⁹. For associations of gene expression with group status, raw gene counts per cell were loaded as a SingleCellExperiment object. Cells from clusters 9 and 10 were not included in this analysis as the median number of cells across samples was less than 20 per cluster. The aggregateData function in the muscat Bioconductor package⁴⁰ was used to pseudo-bulk the gene read counts across cells for each cluster group. Genes with raw counts less than ten in more than eight samples were removed from the analyses. The pbDS function implementing the statistical methods in the edgeR package³⁷ was used to assess associations of gene expression with group identity.
Results from the cluster-specific pseudo-bulked gene expression association analyses were visualized as volcano plots using EnhancedVolcano⁴¹,⁴². Select genes of interest or genes that passed a multiple testing-adjusted P value threshold of 0.05 or 0.1, as indicated, were labeled in the volcano plots. For gene set enrichment analyses, the raw P values for each gene derived from hypothesis tests for associations of interest were combined with a list of genes annotated with each of the gene sets in the biological processes domain of GO⁴³ and analyzed via the simultaneous enrichment analysis method⁴⁴ using the rSEA R package⁴⁵. The family-wise error rate-adjusted P values for cluster-specific associations of interest with each of the annotated gene sets were used to identify significant associations.
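The pseudo-bulking and low-count gene filter described above can be sketched as follows. This is a simplified illustration, not the muscat aggregateData implementation; the function names, sample labels and counts are hypothetical, and the filter encodes one reading of the rule (a gene is removed when its count is below ten in more than eight samples).

```python
from collections import defaultdict

def pseudobulk(cells):
    """Sum raw counts across cells for each (cluster, sample) pair.

    cells: iterable of (sample, cluster, {gene: count}) tuples.
    """
    totals = defaultdict(lambda: defaultdict(int))
    for sample, cluster, counts in cells:
        for gene, n in counts.items():
            totals[(cluster, sample)][gene] += n
    return totals

def keep_gene(counts_per_sample, min_count=10, max_low_samples=8):
    """Keep a gene unless its count is below min_count in more than
    max_low_samples samples."""
    low = sum(1 for c in counts_per_sample if c < min_count)
    return low <= max_low_samples

# Hypothetical cells from two samples in cluster 0.
cells = [
    ("S1", 0, {"ALAS2": 3, "CD3E": 1}),
    ("S1", 0, {"ALAS2": 2}),
    ("S2", 0, {"CD3E": 4}),
]
totals = pseudobulk(cells)

# Hypothetical pseudo-bulk counts for two genes across 12 samples.
sparse_gene = [0] * 9 + [50] * 3   # low in 9 samples: removed
border_gene = [5] * 8 + [20] * 4   # low in exactly 8 samples: kept
```

Summing counts per cluster and sample before testing treats each sample, rather than each cell, as the unit of replication.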

Olink

The Olink EXPLORE 384 inflammation protein extension assay was performed per manufacturer’s protocol as published in ref. ⁴⁶.

Data visualization

HOPACH³⁰ was used to find the best cluster number. Gene expression values were log-transformed and centered using the average expression value. Clustering was performed by running the k-means algorithm using the best cluster number k found, and the results were plotted using the pheatmap package⁴⁷. For gene network analyses, the STRING interaction database was used to reconstruct gene networks using stringApp⁴⁸ for Cytoscape⁴⁹. For the network, the top 50 genes or 25 proteins with the lowest P values were selected from the RNA-seq data and Olink data, respectively. They were then subjected to stringApp with an interaction score cutoff = 0.5 and the number of maximum additional indirect interactors cutoff = 10.

Statistical tests

Unless otherwise indicated, permutation tests, two-tailed unpaired Student’s t tests and Welch’s t tests were used for statistical analyses. *P < 0.05, **P < 0.01, ***P < 0.001, ****P < 0.0001; NS, not significant. Error bars correspond to s.d. Graphs were plotted using GraphPad Prism (version 9.4.1). All measurements were taken from distinct samples; no samples were measured repeatedly to generate data. Where appropriate, P values were corrected for multiple testing (across three pairwise comparisons) using the Holm procedure³⁸. Tests involving cluster membership differences assumed a binomial probability distribution, and those involving RNA expression differences assumed a negative binomial probability distribution, but these assumptions were not formally tested. All other tests were based on the normality assumption, but this was not formally tested.
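The Holm correction across three pairwise comparisons can be illustrated with a short sketch. The P values below are hypothetical and the function name is an illustrative choice; the step-down rule itself is the standard Holm procedure.

```python
def holm_adjust(pvalues):
    """Return Holm-adjusted P values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_max = 0.0
    # The smallest P value is multiplied by m, the next by m - 1, and so on;
    # a running maximum keeps the adjusted values monotone.
    for rank, i in enumerate(order):
        candidate = min(1.0, (m - rank) * pvalues[i])
        running_max = max(running_max, candidate)
        adjusted[i] = running_max
    return adjusted

# Hypothetical raw P values for three pairwise group comparisons.
raw_p = [0.01, 0.04, 0.03]
holm_p = holm_adjust(raw_p)
```

With these inputs, the smallest P value is tripled, the middle one is doubled, and the largest inherits the middle one's adjusted value because adjusted values may not decrease.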

Statistics and reproducibility

No statistical method was used to predetermine the sample size. Samples were chosen based on the availability of specimens meeting our LC criteria. No samples were excluded from the analyses. Randomization was not implemented as the study compared LC to R individuals. Data collection and analysis were not performed blind to the conditions of the experiments.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

Data availability

The raw CyTOF datasets for this study corresponding to total and SARS-CoV-2-specific CD4⁺ and CD8⁺ T cells are publicly accessible through the following link: https://datadryad.org/stash/dataset/doi:10.7272/Q6WD3XTB. The raw Olink data are also downloadable through this link. The raw bulk RNA-seq and scRNA-seq data from this study are deposited in the Gene Expression Omnibus database—GSE224615 (for bulk RNA-seq) and GSE235050 (for scRNA-seq).

References

1. Davis, H. E., McCorkell, L., Vogel, J. M. & Topol, E. J. Long COVID: major findings, mechanisms and recommendations. Nat. Rev. Microbiol. 21, 133–146 (2023).
2. Ma, T. et al. Protracted yet coordinated differentiation of long-lived SARS-CoV-2-specific CD8⁺ T cells during convalescence. J. Immunol. 207, 1344–1356 (2021).
3. Neidleman, J. et al. SARS-CoV-2-specific T cells exhibit phenotypic features of helper function, lack of terminal differentiation, and high proliferation potential. Cell Rep. Med. 1, 100081 (2020).
4. Neidleman, J. et al. Distinctive features of SARS-CoV-2-specific T cells predict recovery from severe COVID-19. Cell Rep. 36, 109414 (2021).
5. Neidleman, J. et al. mRNA vaccine-induced T cells respond identically to SARS-CoV-2 variants of concern but differ in longevity and homing properties depending on prior infection status. eLife 10, e72619 (2021).
6. Suryawanshi, R. K. et al. Limited cross-variant immunity from SARS-CoV-2 Omicron without vaccination. Nature 607, 351–355 (2022).
7. Peluso, M. J. et al. Persistence, magnitude, and patterns of postacute symptoms and quality of life following onset of SARS-CoV-2 infection: cohort description and approaches for measurement. Open Forum Infect. Dis. 9, ofab640 (2022).
8. Rydyznski Moderbacher, C. et al. Antigen-specific adaptive immunity to SARS-CoV-2 in acute COVID-19 and associations with age and disease severity. Cell 183, 996–1012 (2020).
9. Files, J. K. et al. Duration of post-COVID-19 symptoms is associated with sustained SARS-CoV-2-specific immune responses. JCI Insight 6, e151544 (2021).
10. Ma, T. et al. Post-acute immunological and behavioral sequelae in mice after Omicron infection. Preprint at bioRxiv https://doi.org/10.1101/2023.06.05.543758 (2023).
11. Krishna, B. A. et al. Evidence of previous SARS-CoV-2 infection in seronegative patients with long COVID. EBioMedicine 81, 104129 (2022).
12. Khan, O. et al. TOX transcriptionally and epigenetically programs CD8⁺ T cell exhaustion. Nature 571, 211–218 (2019).
13. Devine, R. D. & Behbehani, G. K. Use of the pyrimidine analog, 5-iodo-2′-deoxyuridine (IdU) with cell cycle markers to establish cell cycle phases in a mass cytometry platform. J. Vis. Exp. https://doi.org/10.3791/60556 (2021).
14. World Health Organization. A clinical case definition of post COVID-19 condition by a Delphi consensus. www.who.int/publications/i/item/WHO-2019-nCoV-Post_COVID-19_condition-Clinical_case_definition-2021.1 (2021).
15. Peluso, M. J. et al. Long-term SARS-CoV-2-specific immune and inflammatory responses in individuals recovering from COVID-19 with and without post-acute symptoms. Cell Rep. 36, 109518 (2021).
16. Ferretti, A. P. et al. Unbiased screens show CD8⁺ T cells of COVID-19 patients recognize shared epitopes in SARS-CoV-2 that largely reside outside the spike protein. Immunity 53, 1095–1107 (2020).
17. Costantini, A. et al. Effects of cryopreservation on lymphocyte immunophenotype and function. J. Immunol. Methods 278, 145–155 (2003).
18. Cavrois, M. et al. Mass cytometric analysis of HIV entry, replication, and remodeling in tissue CD4⁺ T cells. Cell Rep. 20, 984–998 (2017).
19. Neidleman, J. et al. Phenotypic analysis of the unstimulated in vivo HIV CD4 T cell reservoir. eLife 9, e60933 (2020).
20. Steiner, S. et al. SARS-CoV-2 T cell response in severe and fatal COVID-19 in primary antibody deficiency patients without specific humoral immunity. Front. Immunol. 13, 840126 (2022).
21. Roederer, M., Nozzi, J. L. & Nason, M. C. SPICE: exploration and analysis of post-cytometric complex multivariate datasets. Cytometry A 79, 167–174 (2011).
22. Hao, Y. et al. Integrated analysis of multimodal single-cell data. Cell 184, 3573–3587 (2021).
23. Korsunsky, I. et al. Fast, sensitive and accurate integration of single-cell data with Harmony. Nat. Methods 16, 1289–1296 (2019).
24. Breiman, L. Random forests. Mach. Learn. 45, 5–32 (2001).
25. George, A. F. et al. Deep phenotypic analysis of blood and lymphoid T and NK cells from HIV⁺ controllers and ART-suppressed individuals. Front. Immunol. 13, 803417 (2022).
26. Bates, D., Mächler, M., Bolker, B. & Walker, S. Fitting linear mixed-effects models using lme4. J. Stat. Softw. https://doi.org/10.48550/arXiv.1406.5823 (2015).
27. Lenth, R., Singmann, H., Love, J., Buerkner, P. & Herve, M. Emmeans: estimated marginal means, aka least-squares means. R package version 1 https://github.com/rvlenth/emmeans (2018).
28. Dobin, A. et al. STAR: ultrafast universal RNA-seq aligner. Bioinformatics 29, 15–21 (2013).
29. Love, M. I., Huber, W. & Anders, S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 15, 1–21 (2014).
30. Van der Laan, M. & Pollard, K. A new algorithm for hybrid clustering of gene expression data with visualization and the bootstrap. J. Stat. Plan Inference 117, 275–303 (2003).
31. Zheng, G. X. et al. Massively parallel digital transcriptional profiling of single cells. Nat. Commun. 8, 14049 (2017).
32. Ahlmann-Eltze, C. & Huber, W. glmGamPoi: fitting Gamma-Poisson generalized linear models on single cell count data. Bioinformatics 36, 5701–5702 (2020).
33. Blondel, V. D., Guillaume, J.-L., Lambiotte, R. & Lefebvre, E. Fast unfolding of communities in large networks. J. Stat. Mech. Theory Exp. 2008, P10008 (2008).
34. McGinnis, C. S. et al. No detectable alloreactive transcriptional responses under standard sample preparation conditions during donor-multiplexed single-cell RNA sequencing of peripheral blood mononuclear cells. BMC Biol. 19, 10 (2021).
35. Xu, C. et al. Comprehensive multi-omics single-cell data integration reveals greater heterogeneity in the human immune system. iScience 25, 105123 (2022).
36. Ianevski, A., Giri, A. K. & Aittokallio, T. Fully-automated and ultra-fast cell-type identification using specific marker combinations from single-cell transcriptomic data. Nat. Commun. 13, 1246 (2022).
37. Robinson, M. D., McCarthy, D. J. & Smyth, G. K. edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics 26, 139–140 (2010).
38. Holm, S. A. A simple sequentially rejective multiple test procedure. Scand. J. Stat. 6, 65–70 (1979).
39. Benjamini, Y. & Hochberg, Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J. R. Stat. Soc. Ser. B Methodol. 57, 289–300 (1995).
40. Crowell, H., Germain, P., Soneson, C., Sonrel, A. & Robinson, M. muscat: multi-sample multi-group scRNA-seq data analysis tools. R package version 1.14.10. https://github.com/HelenaLC/muscat (2023).
41. Blighe, K., Rana, S. & Lewis, M. Publication-ready volcano plots with enhanced colouring and labeling. R package version 1.18.10. https://github.com/kevinblighe/EnhancedVolcano (2020).
42. Huber, W. et al. Orchestrating high-throughput genomic analysis with Bioconductor. Nat. Methods 12, 115–121 (2015).
43. The Gene Ontology Consortium. The gene ontology resource: 20 years and still going strong. Nucleic Acids Res. 47, D330–D338 (2019).
44. Ebrahimpoor, M., Spitali, P., Hettne, K., Tsonaka, R. & Goeman, J. Simultaneous enrichment analysis of all possible gene-sets: unifying self-contained and competitive methods. Brief. Bioinform. 21, 1302–1312 (2020).
45. Ebrahimpoor, M. rSEA: simultaneous enrichment analysis. R package version 2.1.1. CRAN.R-project.org/package=rSEA (2020).
46. Assarsson, E. et al. Homogenous 96-plex PEA immunoassay exhibiting high sensitivity, specificity, and excellent scalability. PLoS ONE 9, e95192 (2014).
47. Kolde, R. pheatmap: Pretty Heatmaps. R package version 1.0.12. https://github.com/raivokolde/pheatmap (2018).
48. Doncheva, N. T., Morris, J. H., Gorodkin, J. & Jensen, L. J. Cytoscape StringApp: network analysis and visualization of proteomics data. J. Proteome Res. 18, 623–632 (2018).
49. Shannon, P. et al. Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res. 13, 2498–2504 (2003).

Acknowledgements

This work was supported by the Van Auken Private Foundation, D. Henke, P. Taft and E. Taft; philanthropic funds donated to Gladstone Institutes by the Roddenberry Foundation and individual donors devoted to COVID-19 research; the Program for Breakthrough Biomedical Research, which is partly funded by the Sandler Foundation and awards 2164 and 2208 from Fast Grants, a part of Emergent Ventures at the Mercatus Center, George Mason University (to N.R.R.). We acknowledge the National Institutes of Health (NIH) DRC Center Grant P30 DK063720 and the S10 1S10OD018040-01 for use of the CyTOF instrument and the NIH S10 RR028962 and the James B. Pendleton Charitable Trust for use of the Fortessa X-20. This study was also funded by the Ministerium für Wissenschaft, Forschung und Kunst, Baden Württemberg, Germany (KNKC.031) and the Deutsche Forschungsgemeinschaft (DFG; German Research Foundation)—Projektnummer 316249678—SFB 1279 (to J.M.). Funding from the PolyBio Research Foundation supported both the experiments reported herein and the parent cohort (LIINC); specimen and clinical data collection were also supported by NIH 3R01AI141003-03S1 and NIH R01AI158013 (to M.J.P. and T.J.H.). The funders had no role in study design, data collection and analysis, decision to publish or preparation of the manuscript.

We thank S. Tamaki, V. Nguyen, P. Sanchez and C. Bispo for CyTOF assistance at the Parnassus Flow Core, J. Srivastava and V. Saware for technical assistance in flow cytometry, M. Karacan and N. Preising for technical assistance in peptide synthesis, E. Ghosn for guidance on annotation of cell clusters identified by scRNA-seq, J. Carroll for assistance on graphics, F. Chanut for editorial assistance and R. Givens for administrative assistance. We are grateful to the study participants and their medical providers. We acknowledge current and former LIINC clinical study team members T. Abualhsan, A. Alvarez, M. Arreguin, M. Buitrago, M. Deswal, N. DelCastillo, E. Fehrman, H. Grebe, H. Hartig, Y. Hernandez, M. Kerbleski, R. Kirtikar, J. Lombardo, M. Luna, L. Ngo, E. Ortiz, A. Rodriguez, J. Romero, D. Ryder, R. Sanchez, M. So, C. Song, V. Tai, A. Tang, C. Thanh, F. Ticas, L. Torres, B. Tran, D. Varma and M. Williams. We also acknowledge LIINC laboratory team members A. Buck, J. Donatelli, J. Hakim, N. Iyer, O. Janson, B. LaFranchi, C. Nixon, I. Thomas and K. Turcios. We thank J. Chen, A. Donovan and C. Forman for assistance with data entry and review. We thank the UCSF AIDS Specimen Bank for processing specimens and maintaining the LIINC biospecimen repository. We are grateful to E. Eilkhani and M. Deswal for regulatory support. We are also grateful for the contributions of additional current and former LIINC leadership team members—M. Durstenfeld, P. Hsue, B. Greenhouse, I. Rodriguez-Barraquer and R. Rutishauser.

Author information

These authors contributed equally: Kailin Yin, Michael J. Peluso.

Authors and Affiliations

Gladstone Institutes, University of California, San Francisco, San Francisco, CA, USA
Kailin Yin, Xiaoyu Luo, Reuben Thomas, Min-Gyoung Shin, Jason Neidleman, Alicer Andrew, Kyrlia C. Young, Tongcui Ma & Nadia R. Roan
Department of Urology, University of California, San Francisco, San Francisco, CA, USA
Kailin Yin, Xiaoyu Luo, Jason Neidleman, Alicer Andrew, Kyrlia C. Young, Tongcui Ma & Nadia R. Roan
Division of HIV, Infectious Diseases, and Global Medicine, University of California, San Francisco, San Francisco, CA, USA
Michael J. Peluso, Rebecca Hoh, Khamal Anglin, Beatrice Huang, Urania Argueta, Monica Lopez, Daisy Valdivieso, Kofi Asare, Rania Ibrahim & Steven G. Deeks
Division of Experimental Medicine, University of California, San Francisco, San Francisco, CA, USA
Tyler-Marie Deveau, Sadie E. Munter & Timothy J. Henrich
Core Facility Functional Peptidomics, Ulm University Medical Center, Ulm, Germany
Ludger Ständker & Jan Münch
Department of Epidemiology and Biostatistics, University of California, San Francisco, San Francisco, CA, USA
Scott Lu, Sarah A. Goldberg, J. Daniel Kelly & Jeffrey N. Martin
Zuckerberg San Francisco General Hospital and the University of California, San Francisco, San Francisco, CA, USA
Sulggi A. Lee
Division of Laboratory Medicine, University of California, San Francisco, San Francisco, CA, USA
Kara L. Lynch

Contributions

K.Y. designed the experiments, performed CyTOF, flow cytometry and scRNA-seq experiments, conducted analyses, and prepared figures and tables. M.J.P. designed the LIINC cohort, oversaw LIINC cohort procedures and interpreted data. X.L. developed pipelines for data analyses and performed scRNA-seq. R.T. performed clustering and scRNA-seq analyses. M.S. performed RNA-seq and Olink analyses. J.N. prepared peptides and helped with experiments. A.A. and K.C.Y. prepared and analyzed CyTOF specimens. T.M. designed protocols for CyTOF analyses. R.H., K.A. and B.H. managed the LIINC cohort, recruited participants, collected clinical data and collected biospecimens. U.A., M.L., D.V., K.A., T.D. and S.E.M. recruited LIINC participants, collected clinical data and collected biospecimens. T.D. and S.E.M. processed specimens. L.S. and J.M. synthesized peptides. R.I. entered, cleaned and performed quality control on LIINC data. S.L. and S.A.G. managed LIINC data and selected biospecimens. K.L.L. performed antibody assays. S.A.L. designed the CHIRP cohort and oversaw CHIRP cohort procedures. J.D.K. and J.N.M. designed the LIINC cohort and interpreted LIINC clinical data. S.G.D. designed the LIINC cohort, oversaw cohort procedures and interpreted LIINC clinical data. T.J.H. designed the LIINC cohort, oversaw cohort procedures, performed the RNA-seq and Olink studies, interpreted data and prepared figures. N.R.R. conceived the study, performed supervision, conducted data analyses and prepared figures and tables. K.Y., M.J.P., T.J.H. and N.R.R. wrote the manuscript. All authors have read and approved this manuscript.

Corresponding authors

Correspondence to Timothy J. Henrich or Nadia R. Roan.

Ethics declarations

Competing interests

M.J.P. reports consulting fees from Gilead Sciences and AstraZeneca, outside the submitted work. S.G.D. reports grants and/or personal fees from Gilead Sciences, Merck & Co., Viiv, AbbVie, Eli Lilly, ByroLogyx and Enochian Biosciences, outside the submitted work. T.J.H. receives grant support from Merck and consults for Roche. All other authors report no conflicts of interest.

Peer review

Peer review information

Nature Immunology thanks Christina Zielinski and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Primary Handling Editor: Ioana Staicu, in collaboration with the Nature Immunology team. Peer reviewer reports are available.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Cohort characteristics, study design, and subset identification.

a–c, Number of sequelae symptoms at 4 (M4) and 8 (M8) months post-infection (n = 27 LC, n = 16 R) (a), and the numbers of individuals that were male or female (b) and that were hospitalized at the time of acute COVID-19 infection (c), in LC and R study participants. *p < 0.05 (two-sided paired sample t-test). d, The numbers of indicated co-morbidities in LC vs R study participants. e, BMI in LC vs R study participants. *p < 0.05 (two-sided Student’s t-test). Horizontal bars indicate mean, error bars indicate SD, and dots represent individuals, with n = 27 LC and n = 16 R. f, Schematic of experimental design and data analyses. Blood specimens from 27 LC and 16 R individuals were subjected to Olink, serology, CyTOF, RNA-seq and scRNA-seq analyses. The indicated tools on the right were then used for analyses of the resulting high-dimensional datasets. g,h, Gating strategy to identify T cell populations. Intact, live, singlet cells from baseline (g) or SARS-CoV-2 peptide-treated (h) samples were gated for CD3⁺ T cells followed by sub-gating on CD4⁺ and CD8⁺ T cells as indicated. i,j, Gating strategy to define classical CD4⁺ (i) and CD8⁺ (j) T cell subsets.

Extended Data Fig. 2 Cytokine and effector molecule expression in SARS-CoV-2-specific T cells.

a,b, CD4⁺ (a) or CD8⁺ (b) T cells from a representative donor, stimulated (bottom) or not (top) with SARS-CoV-2 spike and T-scan peptides (Methods). Red boxes highlight the cytokines used to define the SARS-CoV-2-specific T cells. c,d, The percentages of SARS-CoV-2-specific CD4⁺ (c) and CD8⁺ (d) T cells as defined by induction of IFN-γ, IL-2, CCL4, or TNF in response to SARS-CoV-2 peptide stimulations (two-sided Student’s t-test). e,f, IL-6⁺ CD4⁺ T cells are observed in LC individuals. e, CD4⁺ T cells from a representative donor, stimulated (right) or not (left) with SARS-CoV-2 spike and T-scan peptides (Methods). f, The percentages of SARS-CoV-2-specific CD4⁺ T cells inducing IL-6 in response to SARS-CoV-2 peptide stimulations. *p < 0.05 (two-sided Welch’s t-test). Horizontal bars indicate mean, error bars indicate SD, and dots represent individuals, with n = 27 LC and n = 16 R (c, d, f).

Extended Data Fig. 3 Subset distribution of total and SARS-CoV-2-specific CD8⁺ T cells among LC and R individuals.

a, Frequencies of T_N cells, T_SCM cells, T_CM cells, T_EM cells, T_TM cells, and T_EMRA cells among total CD8⁺ T cells from LC and R individuals (two-sided Student’s t-test). b, Frequencies of T_N cells, T_SCM cells, T_CM cells, T_EM cells, T_TM cells, and T_EMRA cells among SARS-CoV-2-specific CD8⁺ T cells from LC and R individuals (two-sided Student’s t-test).

Extended Data Fig. 4 MSI of CyTOF phenotyping markers among total CD4⁺ and CD8⁺ T cells from LC and R individuals.

Antigens are shown in the order listed in Supplementary Table 4. Results are gated on live, singlet CD4⁺ (a) or CD8⁺ (b) T cells. No significant differences were observed between LC and R individuals for any of the antigens (two-sided t-test with multiple correction by Sidak adjustment). Box plots represent the median (middle bar), 75% quartile (upper hinge) and 25% (lower hinge) with whiskers extending 1.5× interquartile range, dots represent individuals with n = 27 LC and n = 16 R.

Extended Data Fig. 5 MSI of CyTOF phenotyping markers among SARS-CoV-2-specific CD4⁺ and CD8⁺ T cells from LC and R individuals.

Results are presented as in Extended Data Fig. 4, but gated on SARS-CoV-2-specific CD4⁺ (a) or CD8⁺ (b) T cells. No significant differences were observed between LC and R individuals for any of the antigens (two-sided t-test with multiple correction by Sidak adjustment). Box plots represent the median (middle bar), 75% quartile (upper hinge) and 25% (lower hinge) with whiskers extending 1.5× interquartile range, dots represent individuals with n = 27 LC and n = 16 R.

Extended Data Fig. 6 Activated T cells are not more abundant in individuals with LC.

The percentages of total CD4⁺ T cells (a), total CD8⁺ T cells (b), SARS-CoV-2-specific CD4⁺ T cells (c), and SARS-CoV-2-specific CD8⁺ T cells (d) expressing acute activation markers CD38, HLA-DR, and/or Ki67 in LC and R individuals (two-sided Student’s t-tests). Horizontal bars indicate mean, error bars indicate SD, and dots represent individuals, with n = 27 LC and n = 16 R.

Extended Data Fig. 7 Sex-dimorphic T cell cluster distribution in individuals with LC.

a, Cluster distribution among total CD4⁺ T cells as depicted by UMAP. b, The distributions of CD4⁺ T cell clusters A1 and A4 in male and female individuals, with or without LC. Two-sided p-values were derived from a GLMM fit (see Methods). Individual points represent individuals, with n = 10 LC and n = 9 R in the male group and n = 17 LC and n = 7 R in the female group, and where the value corresponds to % of cells belonging to clusters A1 or A4. c, Expression levels of differentiation markers (CD45RA, CD45RO, CD27), activation markers (HLA-DR, OX40), tissue homing receptors (CD29, CXCR4), and lymph node homing receptors (CD62L, CCR7) on CD4⁺ T cell cluster A1 compared to total baseline CD4⁺ T cells. d, Expression levels of differentiation markers (CD45RA, CD45RO, CD27, CD57), cytolytic effectors (perforin, granzyme B), tissue homing receptors (CD29, CXCR4, CCR5), and lymph node homing receptors (CD62L, CCR7) on CD4⁺ T cell cluster A4 compared to total baseline CD4⁺ T cells. e, Cluster distribution among total CD8⁺ T cells as depicted by UMAP. f, The distributions of CD8⁺ T cell clusters B1 and B2 in male and female individuals, with or without LC. Two-sided p-values were derived from a GLMM fit (see Methods). Individual points represent individuals, with n = 10 LC and n = 9 R in the male group and n = 17 LC and n = 7 R in the female group, and where the value corresponds to % of cells belonging to clusters B1 or B2. g, Expression levels of differentiation markers (CD45RA, CD45RO, CD27), activation markers (HLA-DR, OX40), tissue homing receptors (CD29, CXCR4), and lymph node homing receptors (CD62L, CCR7) on CD8⁺ T cell cluster B1 compared to total baseline CD8⁺ T cells. h, Expression levels of differentiation markers (CD45RA, CD45RO, CD27, CD57), cytolytic effectors (perforin, granzyme B), tissue homing receptors (CD29, CXCR4, CCR5), and lymph node homing receptors (CD62L, CCR7) on CD8⁺ T cell cluster B2 compared to total baseline CD8⁺ T cells. 
****p < 0.0001 (two-sided paired t-test, c,d,g,h). Horizontal bars indicate mean, error bars indicate SD, and dots represent individuals, with n = 27 LC and n = 16 R (b–d,f–h).

Extended Data Fig. 8 Flow cytometric validation and association analyses.

a, Association of flow cytometric (mean fluorescence intensity, MFI) vs CyTOF (MSI) expression levels of CXCR4, CXCR5, and CCR6. Data were analyzed by Pearson correlation coefficient and two-tailed unpaired t-tests. b, Flow cytometric gating strategy to identify memory CD4⁺ T cells expressing various combinations of CXCR4, CXCR5, and CCR6. c, The percentages of CXCR4⁺CXCR5⁺CD4⁺, CXCR5⁺CCR6⁺CD4⁺, and CXCR4⁺CCR6⁺CD4⁺ T cells in LC vs R individuals as determined by flow cytometry. *p < 0.05 (two-sided Student’s t-test). d, The percentages of cells dually expressing PD1 and CTLA4 among SARS-CoV-2-specific CD8⁺ T cells (left) or cells dually expressing IFN-γ and TNF among total CD8⁺ T cells (right), as determined by flow cytometry. *p < 0.05 (two-sided Student’s t-test). Horizontal bars indicate mean, error bars indicate SD, and dots represent individuals, with n = 25 LC and n = 15 R (c,d). e, Associations of percentages of CXCR4⁺CXCR5⁺CD4⁺ T cells or CXCR5⁺CCR6⁺CD4⁺ T cells with IL-4 levels in LC vs R individuals. Data were analyzed by Pearson correlation coefficient and two-tailed unpaired t-tests.
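For readers less familiar with the correlation analysis mentioned in panels a and e, here is a minimal sketch of a Pearson coefficient and the t statistic commonly used to test it against zero, t = r·sqrt((n − 2)/(1 − r²)). Whether this is the exact test the authors applied is an assumption; the function names and the paired MFI/MSI values are hypothetical and are not data from the study.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need two sequences of equal length with at least 3 points")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlation_t_statistic(r, n):
    """t statistic for H0: no correlation, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))


# Made-up paired measurements for one marker (not data from the article).
mfi = [1.0, 2.0, 3.0, 4.0, 5.0]  # flow cytometry, hypothetical
msi = [2.0, 4.0, 5.0, 4.0, 5.0]  # CyTOF, hypothetical

r = pearson_r(mfi, msi)
t = correlation_t_statistic(r, len(mfi))
print(f"r = {r:.4f}, t = {t:.4f}, df = {len(mfi) - 2}")
```

A two-tailed p-value would then be read from a t distribution with n − 2 degrees of freedom, which is omitted here to keep the sketch free of third-party dependencies.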

Extended Data Fig. 9 scRNAseq analysis reveals OR7D2 and ALAS2 expression in multiple subsets, validates tissue-homing chemokine receptor expression among LC CD4⁺ T cells, and identifies DEGs among subsets in LC individuals.

a,b, UMAP of cells analyzed by scRNA-seq among LC (n = 8) vs R (n = 4) individuals (a), and among the LC individuals classified as OR7D2^high (n = 4) vs. ALAS2^high (n = 4) (b). c, OR7D2 and ALAS2 expression in the OR7D2^high LC, ALAS2^high LC, and R individuals. **p < 0.01 (two-sided Welch two-sample t-test). Box plots represent the median (middle bar), 75th percentile (upper hinge) and 25th percentile (lower hinge), with whiskers extending 1.5× the interquartile range; dots represent individuals, with n = 8 LC and n = 4 R. d, UMAP depictions of cells expressing (blue) or not expressing (grey) OR7D2 or ALAS2 in individuals with LC. e, OR7D2 and ALAS2 expression in scRNA-seq-identified clusters labeled in Fig. 5d in individuals with LC, depicted as mean % of cells that were positive for OR7D2 or ALAS2 reads. f, Volcano plots showing LC vs R individuals for scRNA-seq-identified CD4⁺ T cell clusters 0 and 7, depicting CXCR4, CXCR5, and CCR6. g,h, Volcano plots depicting scRNA-seq-defined clusters 0, 1, 5, 7, and 8 for OR7D2^high vs. R (g), or clusters 1, 5, 6, 7, and 8 for ALAS2^high vs. R (h) individuals. DEGs with p < 0.05 (as determined by empirical Bayes quasi-likelihood F-tests, with Benjamini-Hochberg correction) are labeled. Genes preferentially expressed in LC individuals are depicted on the right, and those preferentially expressed in R individuals on the left. The x-axes represent the log₂(fold-change) of the mean expression of each gene between the comparison groups, and the y-axes represent the raw –log₁₀(p-values). Dashed horizontal lines delineate the thresholds corresponding to Benjamini-Hochberg adjusted two-tailed p-values of <0.05 (Methods).
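Because the volcano plots show raw –log₁₀(p-values) on the y-axis while the dashed line marks a Benjamini-Hochberg adjusted cutoff, it may help to see how the two relate. The sketch below is a generic Benjamini-Hochberg implementation, not the authors' pipeline; the function names and the p-values are hypothetical.

```python
import math


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def volcano_threshold(pvalues, alpha=0.05):
    """-log10 of the largest raw p-value whose adjusted value is below alpha.

    Returns None when no test passes the adjusted cutoff.
    """
    adjusted = benjamini_hochberg(pvalues)
    passing = [p for p, q in zip(pvalues, adjusted) if q < alpha]
    return -math.log10(max(passing)) if passing else None


# Made-up raw p-values for five genes (not data from the article).
raw_p = [0.001, 0.008, 0.039, 0.041, 0.60]
adj_p = benjamini_hochberg(raw_p)
threshold = volcano_threshold(raw_p)

print("adjusted:", [round(q, 5) for q in adj_p])
print("dashed-line height on a raw -log10(p) axis:", threshold)
```

In this toy example, two genes with raw p < 0.05 (0.039 and 0.041) fall below the line after adjustment, which is why the dashed line on a raw-p axis sits higher than –log₁₀(0.05).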

Extended Data Fig. 10 Validation of CyTOF antibodies.

a, CyTOF analysis of human lymphoid aggregate cultures generated from tonsils depicting CD3⁺ T cells on the top and CD3⁻ B cells on the bottom as indicated, analogous to methods previously described¹⁸. b, CyTOF analysis of PMA/ionomycin- or LPS-stimulated PBMCs, depicting CD3⁺ T cells on the top and CD3⁻ cells on the bottom, similar to prior studies²,³,⁴,⁵. c, Expression of Foxp3 among CD4⁺ T_reg cells and CD4⁺ T_N cells, as assessed by CyTOF. d, Expression of CD30 and Ki67 among CD3⁺CD45RO⁺CD45RA⁻CD4⁺ T memory (T_M) cells and CD4⁺ T_N cells, as assessed by CyTOF. ****p < 0.0001 (two-sided paired t-test). e, Illustration of pT_FH gate implemented on PBMC samples, and T_FH gate implemented on tonsil samples. Cells were pre-gated on CD4⁺ T_M cells.

Supplementary information

Supplementary Tables 1–13.

Reporting Summary

Peer Review File

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Yin, K., Peluso, M.J., Luo, X. et al. Long COVID manifests with T cell dysregulation, inflammation and an uncoordinated adaptive immune response to SARS-CoV-2. Nat Immunol 25, 218–225 (2024). https://doi.org/10.1038/s41590-023-01724-6

Received: 09 February 2023
Accepted: 29 November 2023
Published: 11 January 2024
Issue Date: February 2024
DOI: https://doi.org/10.1038/s41590-023-01724-6
