Signature of long-lived memory CD8+ T cells in acute SARS-CoV-2 infection

Immunological memory is a hallmark of adaptive immunity and facilitates an accelerated and enhanced immune response upon re-infection with the same pathogen1,2. Since the outbreak of the ongoing COVID-19 pandemic, a key question has focused on which SARS-CoV-2-specific T cells stimulated during acute infection give rise to long-lived memory T cells3. Here, using spectral flow cytometry combined with cellular indexing of transcriptomes and T cell receptor sequencing, we longitudinally characterized individual SARS-CoV-2-specific CD8+ T cells of patients with COVID-19 from acute infection to 1 year into recovery and found a distinct signature identifying long-lived memory CD8+ T cells. SARS-CoV-2-specific memory CD8+ T cells persisting 1 year after acute infection express CD45RA, IL-7 receptor-α and T cell factor 1, but they maintain low expression of CCR7, thus resembling CD45RA+ effector memory T cells. Tracking individual clones of SARS-CoV-2-specific CD8+ T cells, we reveal that an interferon signature marks clones that give rise to long-lived cells, whereas prolonged proliferation and mechanistic target of rapamycin signalling are associated with clonal disappearance from the blood. Collectively, we describe a transcriptional signature that marks long-lived, circulating human memory CD8+ T cells following an acute viral infection. Evidence of a transcriptional signature that marks precursors of long-lived CD8+ memory T cells in SARS-CoV-2 infection.

Immunological memory is a hallmark of adaptive immunity and facilitates an accelerated and enhanced immune response upon re-infection with the same pathogen 1,2 . Since the outbreak of the ongoing COVID-19 pandemic, a key question has focused on which SARS-CoV-2-specific T cells stimulated during acute infection give rise to long-lived memory T cells 3 . Here, using spectral flow cytometry combined with cellular indexing of transcriptomes and T cell receptor sequencing, we longitudinally characterized individual SARS-CoV-2-specific CD8 + T cells of patients with COVID-19 from acute infection to 1 year into recovery and found a distinct signature identifying long-lived memory CD8 + T cells. SARS-CoV-2-specific memory CD8 + T cells persisting 1 year after acute infection express CD45RA, IL-7 receptor-α and T cell factor 1, but they maintain low expression of CCR7, thus resembling CD45RA + effector memory T cells. Tracking individual clones of SARS-CoV-2-specific CD8 + T cells, we reveal that an interferon signature marks clones that give rise to long-lived cells, whereas prolonged proliferation and mechanistic target of rapamycin signalling are associated with clonal disappearance from the blood. Collectively, we describe a transcriptional signature that marks long-lived, circulating human memory CD8 + T cells following an acute viral infection.
The coronavirus disease 2019 (COVID-19) pandemic has taken an extraordinary toll on global health and economy, affecting billions of lives all over the world. The ongoing vaccination efforts appear to curtail the spread of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) and prevent severe disease, even as new virus variants emerge 4,5 . Yet, prevailing questions concern whether and how exposure to SARS-CoV-2 by infection or immunization might result in long-term protective immunity.
On encountering their cognate antigen on antigen-presenting cells, antigen-specific CD8 + T cells proliferate and differentiate into effector cells aimed at controlling the pathogen by killing virus-infected host cells. Following virus elimination, 90-95% of effector T cells undergo apoptosis, whereas some antigen-specific T cells survive to become long-lived memory T cells that are able to protect the host from re-infection with the same pathogen 2,6 .
While antigen-specific effector T cell responses are generated during acute SARS-CoV-2 infection 7-12 and persist for several months [13][14][15][16][17] , little is known about changes in memory phenotypes over time. Previous studies using live-attenuated virus vaccines in healthy donors [18][19][20][21] have described phenotypical trajectories of human antigen-specific T cell populations. However, it is unknown whether infection with a natural virus generates comparable memory T cell responses in humans, as infection route, viral load, inflammation and various host-related factors are likely to affect T cell responses and memory formation. Moreover, phenotypical and transcriptional trajectories at the single T cell receptor (TCR) level and the factors instructing individual effector

Fig. 1 | Characteristics of antigen-specific CD8 + T cells during acute and memory phases of SARS-CoV-2 infection. a,
Overview of study design. PBMC, peripheral blood mononuclear cell. b, Representative plots of CoV2-Dex staining. PE, Phycoerythrin. Numbers in the plots indicate percentage of parent population. c, Frequency of CoV2-Dex + cells in healthy donors and patients with COVID-19 during acute infection and 6 months and 1 year after infection. Each dot represents an independent donor at the indicated timepoint (n = 10 healthy, n = 37 acute, n = 32 6 months, n = 29 1 year after infection). P values are shown. d, Linear regression of frequency of CoV2-Dex + cells 6 months after infection as a function of CoV2-Dex + cell frequencies during acute infection (n = 11). The P value was calculated with t-statistic. e, Uniform manifold approximation and projection (UMAP) plots of marker expression for up to 2,000 CD8 + T cells from each sample collected during acute infection (n = 37) analysed by spectral flow cytometry. Regions with high marker expression appear in red. An overlay of CoV2-Dex + cells (red) and total CD8 + T cells (grey) is shown in the top left. f, Representative histograms showing expression of selected markers on CoV2-Dexand CoV2-Dex + cells. g, Frequency of Ki-67 + , HLA-DR + , granzyme B + , CX3CR1 + and CD127 + cells in CoV2-Dex -(grey) and CoV2-Dex + cells during acute infection and 6 months and 1 year after infection. Analysis was conducted on paired samples from acute infection versus 6 months and/or 1 year after infection (n = 28 acute, n = 24 6 months, n = 29 1 year). The grey lines connect individual donors sampled at different timepoints. P values were calculated using a Wilcoxon-Mann-Whitney test in c and g and corrected for multiple comparisons in g. All tests were performed two-sided.
Article not determine whether HLA-A*24:02 CoV2-Dex + cells had an activated or proliferating phenotype due to low cell numbers. Furthermore, we noted a lower reactivity to HLA-A*11:01 dextramers than to HLA-A*01:01 dextramers (Extended Data Fig. 2c) during acute infection, which persisted 6 months after infection. These data suggest heterogeneity in effector and memory T cell responses based on HLA type and specific epitopes, although they need careful interpretation due to limited patient numbers.
In patients with an HLA-A*01:01 or HLA-A*11:01 allele, the frequency of CoV2-Dex + cells during acute infection correlated with the frequency of specific cells at the memory phase (Fig. 1d). In acute infection, flow cytometry analysis of CoV2-Dex + cells revealed a circumscript phenotype of activated cells, dominated by high abundance of Ki-67 and HLA-DR (Fig. 1e, f, Extended Data Fig. 2d, e). CoV2-Dex + cells also expressed granzyme B and tended to have higher expression of the terminal differentiation marker CX3CR1, whereas surface CD127 (also known as IL-7 receptor-α) was markedly downregulated (Fig. 1f, g). At the 6-month and 1-year timepoints, frequencies of Ki-67 + and HLA-DR + CoV2-Dex + cells declined and the frequency of CD127 + cells increased (Fig. 1f, g), indicating a transition from effector to memory state 6,23 . We did not observe phenotypical differences between HLA-A*01:01 and HLA-A*11:01 (Extended Data Fig. 2f).
Notably, disease severity seemed to positively correlate with the extent of CD8 + T cell responses during acute infection, as well as frequencies of CoV2-Dex + cells 6 months and 1 year after infection, although expansion of CoV2-Dex + cells was also evident in patients with mild disease (Extended Data Fig. 3a). During acute infection, both proliferation and activation were only minimally affected by disease severity in the CoV2-Dex + compartment, whereas a relevant difference was observed in CoV2-Dexcells (Extended Data Fig. 3b). This discrepancy could be due to higher bystander activation in severe disease or higher abundance of undetected SARS-CoV-2-reactive T cells. Patients with severe disease showed higher expression of granzyme B and CX3CR1 on CoV2-Dex + cells, possibly indicating a different T cell differentiation program during the acute phase of severe COVID-19. These differences were no longer evident 6 months and 1 year after infection (Extended Data Fig. 3c, d).
Unbiased clustering revealed 12 distinct CD8 + T cell clusters (Fig. 2a), none of which was dominated by a single patient (Extended Data Fig. 4c). Some clusters showed nearly complete segregation between the acute and memory phases (Extended Data Fig. 5a, b). In line with our flow cytometry data (Fig. 1e, Extended Data Fig. 5c), CoV2-Dex + CD8 + T cells showed a rather segregated transcriptional makeup during acute infection, whereas their transcriptional state was more heterogeneous 6 months after infection (Fig. 2b). Comparing the contribution of CoV2-Dex + cells to different clusters, we observed that clusters 1, 2 and 12 dominated the CoV2-Dex + CD8 + T cell response in the acute phase, whereas clusters 3, 6 and 11 became prominent in the recovery phase (Fig. 2c, Extended Data Fig. 5d). While clusters 1, 2 and 12 corresponded to cytotoxic, activated and proliferating cells, respectively, cluster 3 showed a signature marked by enrichment of NF-κB and Jun/ Fos signalling, cluster 6 displayed an oxidative phosphorylation signature, and cluster 11 showed a dual signature marked by enrichment of interferon (IFN) response genes and genes encoding the effector cytokines IFNγ, tumour necrosis factor (TNF) and lymphotoxin-α (LTα) (Fig. 2d, e). Similarly, among genes with significantly higher expression in CoV2-Dex + cells from the acute phase versus the recovery phase, we found genes related to cytotoxicity (GZMA, GZMK and PFN1), activation (HLA-DRA, CD38 and PDCD5) and proliferation (MKI67, MCM7 and NUDC1), along with IFN response genes (IFI6, MX1, IFI27L2 and IFI44L) (Extended Data Fig. 5e). SELL (which encodes CD62L) appeared to be enriched in cells retrieved during the acute phase rather than the recovery phase (Extended Data Fig. 5e).
To identify phenotypical trajectories in individual antigen-specific T cell clones, we performed TCR sequencing of CoV2-Dex + cells, which revealed several antigen-specific CD8 + T cell clones for each epitope investigated (Fig. 2f). Clones were considered antigen-specific if any of the clonal cells were CoV2-Dex + (data available as Supplementary Dataset 1), and clones that were CoV2-Dex + in the acute phase were considered CoV2-Dex + independently of CoV2-Dex staining at six months after infection, and vice versa. The number of clones detected during convalescence was markedly lower than that detected during the acute phase of infection (Fig. 2f). In most cases, but not all, dominant clones in the acute phase corresponded to the largest clones found in the recovery phase (Fig. 2g). The phenotypical changes in acute infection versus the recovery phase on the general CoV2-Dex + population were reflected in individual T cell clones. Thus, analysis of individual CoV2-Dex + clones showed multiple clones containing cells from clusters 1, 2 or 12 during acute infection and cells from clusters 3, 6 and 11 during recovery (Fig. 2h, Extended Data Fig. 6). To better compare gene expression in acute infection versus recovery across all clones, we compiled an 'acute gene signature' comprising NKG7, PRF1, GZMB, CENPU, CENPF and MKI67, and a 'recovery gene signature' comprising TNF, IFIT2, IFIT3, MT-CO1, MT-CO2 and MT-ATP6. We observed a significant decrease in acute gene signature transcripts in individual T cell clones from the acute phase to the recovery phase, which was paralleled by an increase in the recovery gene signature (Fig. 2i). Accordingly, individual T cell clones showed a decrease in MKI67 and HLA-DRB5 expression between the acute phase and the recovery phase (Fig. 2j).

Memory paths of SARS-CoV-2 + CD8 + cells
To better understand the phenotypical memory trajectories of antigen-specific CD8 + T cells following a naturally occurring acute virus infection, we followed CoV2-Dex + cells longitudinally, at both the population level and the clonal level. In the acute phase, CoV2-Dex + cells showed mostly an effector/effector memory (T effector /T EM ) phenotype, whereas frequencies of naive (T naive ) cells were lower in CoV2-Dex + than in CoV2-Dex -CD8 + T cells ( Fig. 3a, b, Extended Data Fig. 7a).
These data were confirmed in CoV2-Pent + cells (Extended Data Fig. 7b, c). At 6 months and 1 year after infection, we observed a progressive switch from a T effector /T EM phenotype to a terminally differentiated T EM cell re-expressing CD45RA (T EMRA ) phenotype; thus, 1 year after infection, most CoV2-Dex + cells were of a T EMRA phenotype (Fig. 3c, Extended Data Fig. 7d, e). Furthermore, we observed progressive enrichment in stem cell memory T (T SCM ) cells, particularly at the 1-year timepoint (Fig. 3c, Extended Data Fig. 7e). Of note, the increase in T SCM cell enrichment was accompanied by an enrichment in T naive cells 1 year after infection ( Fig. 3c), possibly indicating that memory cells can reacquire CD45RA and CCR7 also in the absence of CD95 expression. We did not observe differences in memory phenotypes based on HLA, except for a tendency towards more central memory (T CM ) cells in HLA-A*11:01 1 year after infection (Extended Data Fig. 7f-h). Conversely, memory phenotypes were strongly influenced by disease severity (Extended Data Fig. 7i-k). Patients with severe disease had fewer CoV2-Dex + T naive cells 6 months and 1 year after infection, tended to have fewer T SCM cells and showed predominantly CoV2-Dex + T EMRA cells 1 year after infection. When examining individual T cell clones, we observed an increase in CD45RA expression and a concomitant decrease in CCR7 protein expression determined by TotalSeq from acute infection to 6 months after infection (Fig. 3d), thus confirming an enrichment of a T EMRA phenotype also on a single TCR level.
Subsequently, we assessed whether the T effector /T EM and T EMRA phenotypes were associated with specific T cell markers, suggesting distinct differentiation states. Indeed, in CoV2-Dex + cells, we observed several differences between the T effector /T EM and T EMRA populations. CoV2-Dex + T EM cells showed higher expression of Ki-67 and HLA-DR, whereas they had lower abundance of CX3CR1 already during acute infection (Fig. 3e).
Notably, we observed the same phenotypical differences between CoV2-Dex + T EM and CoV2-Dex + T EMRA cells 6 months and 1 year after infection ( Fig. 3f, g).
As T cell phenotypes are driven by specific transcription factors, we assessed the expression of T cell factor 1 (TCF1), T-box expressed in T cells (T-BET), eomesodermin (EOMES) and thymocyte selection-associated    Article high-mobility group box (TOX), which are transcription factors known to have important roles in T cell differentiation [24][25][26][27] . CoV2-Dex + cells downregulated TCF1 expression during the acute phase, which was progressively restored at subsequent timepoints (Fig. 3h, i, Extended Data Fig. 8a-c). Conversely, the expression of T-BET was increased in the acute phase (Fig. 3h, Extended Data Fig. 8a) and progressively decreased 6 months and 1 year after infection (Extended Data Fig. 8b, c). A difference in T-BET expression between T effector /T EM and T EMRA CoV2-Dex + cells was not evident, except for a tendency 1 year after infection. However, T EMRA CoV2-Dex + cells expressed lower levels of TCF1 and TOX in the memory phase and lower levels of EOMES at all timepoints (Extended Data Fig. 8d).

Signatures of CD8 + memory precursors
Next, we sought to identify the factors present during acute infection that instruct T cell clones towards a memory fate. We compared clones detected in the peripheral blood in both the acute and the convalescent phases (termed persistent) to those that were only seen in the acute phase and became undetectable in the convalescent phase (non-persistent) (Fig. 4a). Not all clones detected at 6 months after infection were present in the acute phase, probably reflecting a limitation of detection (Extended Data Fig. 9a). Generally, clone size correlated positively with persistence (Fig. 4b). Cells of persistent clones showed a different transcriptional makeup in the acute phase when compared to cells of non-persistent clones (Fig. 4c), which also resulted in a different distribution in the previously identified CD8 + T cell clusters (Extended Data Fig. 9b). This effect was robustly seen in different clones and was not due to a few hyper-expanded clones (Extended Data   = 139 non-persistent). c, UMAP plot of persistent (red) versus non-persistent (green) CoV2-Dex + CD8 + T cell clones detected during acute infection. d, Gene set enrichment analysis showing enrichment of genes associated with cytokine signalling in persistent clones and mTOR signalling and proliferation in non-persistent CoV2-Dex + T cell clones. Red dashed lines indicate minimal and maximal cumulative enrichment values. P value calculation was performed as detailed in the Method section. e, Expression of selected genes and CCR7 and CD45RA protein determined by Totalseq for persistent versus non-persistent CoV2-Dex + T cell clones. P values were calculated using a Wilcoxon signed-rank test; a Bonferroni correction was applied for multiple comparisons. f, Expression level of selected genes in persistent versus non-persistent individual T cell clones; each dot represents one clone. g, Expression level of selected genes in cells from a single non-persistent clone compared to cells from a single persistent T cell clone; each dot represents one cell (n = 5 CASSQVIGNQPQHF, n = 16 CASSAPGPLTTQYF). In f, g, the white diamonds indicate median expression. For b, f, g, P values were calculated using a Wilcoxon-Mann-Whitney test. All tests were performed two-sided. Fig. 9c). Gene set enrichment analysis revealed distinct signatures in persistent versus non-persistent clones. Genes involved in IFNγ and IFNα responses and TNF signalling were enriched in cells from persistent clones, whereas mechanistic target of rapamycin (mTOR) signalling and mitosis-related genes were enriched in cells from non-persistent clones (Fig. 4d). By comparing differentially regulated genes between cells from persistent and non-persistent clones, we observed genes associated with activation (HLA-DQA1 and HLA-DPA1), terminal differentiation (KLRG1) and cytotoxicity (GZMM and NKG7), as well as certain IFN-induced (B2M and HLA-C) and TNF-induced (CCL4) genes to be enriched in persisters, along with CD45RA protein expression determined by TotalSeq (Fig. 4e). Conversely, cells from non-persistent clones showed higher expression of CTLA4, TIM3 (encoded by HAVCR2), Ki-67 (encoded by MKI67) and the mTOR-induced gene CORO1A (Fig. 4e). The same differences in gene expression could be observed at the clonal level; thus, genes were upregulated or downregulated accordingly in non-persistent individual clones compared to persistent clones (Fig. 4f), as exemplified in two selected clones of a patient (Fig. 4g). Non-persisters showed higher expression of SELL during the acute phase (Fig. 4e, f). We also observed differential TCR-Vβ usage between persistent and non-persistent clones (Fig. 4e).

Discussion
In this study, we address outstanding questions related to CD8 + T cell memory upon acute SARS-CoV-2 infection in humans by longitudinally following individual memory CD8 + T cell clones. Phenotypically, we find a transition of CD8 + T cells from T effector /T EM cells to T EMRA cells with progressive enrichment of TCF1 + cells, which is paralleled by a modest enrichment in T SCM cells. While two previous papers have reported a high prevalence of T EMRA cells among SARS-CoV-2-specific CD8 + T cells 13,28 , our study provides a description of progressive enrichment in this specific phenotype during the memory phase, at both the clonal level and the population level, revealing that CD8 + T EMRA cells might constitute the main circulating memory subset following an acute viral infection in humans.
Our data provide a different and more detailed view of individual antigen-specific human memory CD8 + T cells than the one observed in the tetramer-positive memory CD8 + T cell population in individuals vaccinated against yellow fever virus 19 , where the prevalent subset observed was T SCM . As that study dealt with human memory CD8 + T cells examined several years after vaccination, T EMRA and T SCM cells could be part of the same phenotypical trajectory, with progressive enrichment in T SCM cells over time due to differentiation or competitive advantage. We favour an alternative hypothesis based on the phenotypical differences between memory cells in mild and severe COVID-19 that we observed, suggesting that other factors-such as antigen availability, type of antigen-presenting cells stimulated and cytokine milieu-might influence the type of memory formed, with increased T EMRA cell differentiation upon severe disease versus prevalent T SCM cell formation in mild disease and upon inoculation with live-attenuated virus vaccines.
We also observed enrichment of CD62L expression in CoV2-Dex + cells during the acute phase rather than the recovery phase and in non-persistent clones disappearing from the circulation rather than in clones giving rise to circulating memory T cells. Whereas CD62L expression is a hallmark of recirculating T CM cells 29 , which are the prevalent memory cells in lymph and secondary lymphoid organs 30 , CD62L is not typically associated with the T EMRA phenotype 31 . This might explain our findings, as CD62L expression appears not to be part of the memory differentiation trajectory observed 20 . Whether a fraction of T cell clones detected in peripheral blood during the acute phase contributes to different memory pools in secondary lymphoid organs cannot be excluded in our present study.
Understanding how the immune system maintains the balance between effector response and memory formation could provide insights on why some infections result in robust and long-lasting T cell memory, whereas others fail to do so. Our study helps to unravel the complexity of these processes by finding a transcriptional signature at the level of T cell clones that correlates with the acquisition of long-lived, circulating memory T cells. We find that a strongly proliferative phenotype is associated with clonal contraction and disappearance. Furthermore mTOR signalling, probably stimulated by TCR engagement, appears to instruct the fate of short-lived effector cells, similar to previous results in mice 32 and in vitro studies 33 . Conversely, cytokine signalling marks cells destined to become long-lived, circulating memory cells, in agreement with previous studies showing the importance of type I IFN for memory generation 34 . As we sampled SARS-CoV-2-specific T cells from the peripheral blood, we can only infer persistence of CD8 + T cell clones in this compartment. Importantly, memory phenotypes and the factors instructing their differentiation might vary in different immunological contexts, such as the lymph node or tissue. Similarly, as we could examine specific cells only from donors with HLA-A*01:01 and HLA-A*11:01 alleles and, to some extent, donors with an HLA-A*24:02 allele, further studies will be needed to compare our findings in other HLA types.
Collectively, our data demonstrate the formation of memory CD8 + T cells to be dependent on a delicate balance between cytokine and TCR signalling during acute infection, which in turn influences outcomes of long-lived, circulating memory T cells in humans.

Online content
Any methods, additional references, Nature Research reporting summaries, source data, extended data, supplementary information, acknowledgements, peer review information; details of author contributions and competing interests; and statements of data and code availability are available at https://doi.org/10.1038/s41586-021-04280-x.

IgA and IgG immunoassays
Spike S1-specific IgA and IgG antibodies were assessed with a commercial ELISA kit (SARS-CoV-2 IgA and IgG immunoassay, Euroimmun), as previously described 35 . OD ratios higher than 2.0 and 1.1 were considered positive for serum IgA and IgG, respectively.

Dextramer and pentamer staining
PBMCs (4 × 10 6 ) per patient were incubated with Human TruStain FcX blocking reagent (422302, BioLegend) for 10 min at 4 °C. After washing, cells were incubated with MHC-I dextramers (see Supplementary  Table 1) in the presence of l-biotin and herring sperm DNA according to the manufacturer's instructions, for 10 min at room temperature. Two peptides presented on HLA-A*01:01 dextramers (FTSDYYQLY from ORF3a and TTDPSFLGRY from ORF1ab), two peptides presented on HLA-A*11:01 dextramers (ATEGALNTPK and KTFPPTEP from nucleocapsid protein) and one peptide presented on HLA-A*24:02 dextramers (QYIKWPWYI from spike protein) were included. For MHC-I pentamer staining, cells were incubated for 10 min at 37 °C with pentamers (Extended Data Table 1). One peptide presented on HLA-A*01:01 pentamers (FTSDYYQLY from ORF3a) and one peptide presented on HLA-A*11:01 pentamers (KTFPPTEP from nucleocapsid protein) were included. Frozen PBMCs were used throughout the study.

Spectral flow cytometry
After dextramer or pentamer staining, a concentrated surface staining antibody mix (Supplementary Tables 2, 3) was applied without washing and samples were incubated at room temperature for further 20 min. After four rounds of washing, cells were resuspended in a fixation permeabilization solution (eBioscience Foxp3/transcription factor staining buffer) and incubated for 60 min at room temperature. After washing, an antibody mix for intracellular staining (Supplementary  Tables 2, 3) was added and cells were incubated for 30 min at room temperature. After washing, samples were acquired on a Cytek Aurora spectral flow cytometer using the SpectroFlo software. Quality control for the cytometer was performed daily. Data were analysed with FlowJo (version 10.7.1) and OMIQ (www.omiq.ai). Phenotypes were evaluated only in patients with more than 5 CoV2-Dex + cells per sample (n = 28 acute, n = 24 6 months after infection, n = 29 1 year after infection) or more than 5 CoV2-Pent + cells per sample (n = 7 acute, n = 9 6 months after infection). Correlation between frequency of CoV2-Dex + cells during acute infection and frequency of CoV2-Dex + cells 6 months after infection was assessed only for patients sampled at least 14 days after symptom onset (n = 11). A concentrated antibody mix containing TotalSeq antibodies (see Supplementary Table 4 for a complete list) was applied after dextramer staining without washing and cells were incubated at 4 °C for 30 min. After four rounds of washing, cells were resuspended in PBS with 2% FBS and 2 mM EDTA and sorted with a BD Aria cell sorter. For each patient, CoV2-Dexand CoV2-Dex + cells were sorted approximately in a 10:1 ratio. All CoV2-Dex + cells from each sample were sorted, the corresponding amount of CoV2-Dexcells was calculated and sorted in the same tube. Cells from ten patients at the same timepoint were pooled together, generating four individual sample sets in total:

scRNA-seq library preparation and sequencing
Cells of sample sets 1-7 were analysed by scRNA-seq utilizing the 5′ Single Cell GEX and VDJ v1.1 platforms (10x Genomics). Each sample set was processed individually. Cell suspensions were pelleted, resuspended and loaded into the Chromium Chip following the manufacturer's instructions. Fourteen cycles of initial cDNA amplification were used for all sets and single-cell sequencing libraries for whole-transcriptome analysis (GEX), TCR profiling (VDJ), and combined cell-surface protein and dCODE Dextramers detection (ADT) were generated. Final libraries were quantified using a Qubit Fluorometer, pooled in a ratio of 5:1:1 (GEX:VDJ:ADT) and sequenced on a NovaSeq 6000 system with the following cycle configuration: read 1: 28 bp; index read 1: 10 bp; read 2: 101 bp.

Single-cell transcriptome analysis
Raw scRNA-seq FASTQ files were aligned to the human GRCh38 genome with Cell Ranger version 5.0.0 with default settings for the 'cellranger multi' pipeline (10x Genomics). The reference genome was downloaded from the 10x Genomics website (https://cf.10xgenomics. com/supp/cell-exp/refdata-gex-GRCh38-2020-A.tar.gz) and built as per official release notes (https://support.10xgenomics.com/ single-cell-gene-expression/software/release-notes/build#GRCh38_ 2020A). Every sample set was analysed with the 'cellranger multi' pipeline, which allows to process together the paired GEX, ADT and VDJ libraries for each set. Downstream analysis was conducted in R version 4.1.0 with the package Seurat version 4.0.3 (ref. 38 ). Cells with fewer than 200 or more than 2,500 detected genes and cells with more than 10% detected mitochondrial genes were excluded from the analysis.
To investigate possible patient biases, we demultiplexed cells from patient pools 1-6 based on genetic variants detected within the scRNA-seq reads. For this, we used the tool souporcell version 2 (ref. 39 ). To cluster cells based on their patient-specific genetic variants, we merged sample sets 1, 2 and 5 (comprising sorted cells from both timepoints of patients CoV2_001-CoV2_010 and unsorted cells of the same patients) and sets 3, 4 and 6 (comprising cells from both timepoints of patients CoV2_011-CoV2_020 and unsorted cells of the same patients). Then, we executed the souporcell pipeline with option k = 10 (the number of clusters to be determined) for each of the two merged sample sets. This analysis allowed us to classify 88% of the cells passing the filtering steps from above into 20 genotype-driven 'patient' clusters.
After log normalization and variable feature calculation, independent datasets were integrated using Seurat's anchoring-based integration method. Data scaling, principal component analysis, clustering and UMAP visualizations were performed on the integrated dataset using 15 principal components and a resolution of 0.5 for the shared nearest-neighbour clustering algorithm. To define distinct biological features of cell clusters, differential gene expression analyses were done on assay 'RNA' of the integrated dataset. FindAllMarkers was executed with logfc.threshold and min.pct cut-offs set to 0.25. For the analysis of clusters, FindMarkers was used with default settings for the comparison of persistent and non-persistent clones. For the differential expression analysis of manually selected genes and cell-surface proteins (CD45RA and CCR7), logfc.threshold and min.pct cut-offs were set to 0.
For gene set enrichment analysis, the FindMarkers function from Seurat was first used for the differential expression of genes between cells belonging to persistent and non-persistent clones (using the default Wilcoxon rank-sum test, with options 'min.pct=0.1, logfc.threshold = -Inf', to account also for small expression changes, as long as the genes were expressed in at least 10% of cells of at least one group). The resulting 4,701 genes were pre-ranked in decreasing order by the negative logarithm of their P value, multiplied for the sign of their average log-fold change (in R, '-log(P_val)*sign(avg_log 2 FC)'). Gene set enrichment analysis 40 was performed on this pre-ranked list using the R package FGSEA (https://github.com/ctlab/fgsea/) 41 . We used the FGSEA-simple procedure with 100,000 permutations and the hallmark gene sets for Homo sapiens from the Molecular Signatures Database (https://www. gsea-msigdb.org/gsea/msigdb/index.jsp, made accessible in R by the package msigdbr; https://github.com/cran/msigdbr) and set the seed value ('set.seed(42)' in R) before execution to make the results reproducible. For significance testing, the function fgsea::fgsea() was used, which performs a P value estimation based on an adaptive multi-level split Monte-Carlo scheme. A multiple hypothesis correction procedure was applied to get adjusted P values. The results were filtered for gene sets that were significantly enriched with adjusted P < 0.1.

TCR profiling
Paired chain TCR sequences were obtained through targeted amplification of full-length V(D)J segments during library preparation. Sequence assembly and clonotype calling was done through cellranger's immune profiling pipeline (cellranger multi). TCR profiling on filtered contig annotations was done using R package scRepertoire version 1.1.4 (ref. 42 ), which assigns TCR nucleotide and amino acid sequences together with clonal frequency counts and a clonotype classification to each cell. The function combineTCR was executed with filterMulti=T to isolate the top two expressed chains in cell barcodes with multiple chains. Clonotypes were called based on the amino acid sequence of the CDR3 region of TCRα and TCRβ chains. For cells of which only one of the two chains could be identified, the available chain was used. Clone calling was done for each sample set independently before integration.

SARS-CoV-2 peptide-loaded dextramer binding of CD8 + T cells
To identify SARS-CoV-2-specific CD8 + T cells, we used dCODE Dextramers loaded with viral peptides presented on MHC-I molecules as described above. To assess unspecific binding, a negative control dextramer (peptide STEGGGLAY presented on HLA-A*01:01) and a general negative control dextramer were included. After analysis of the flow cytometry data, we noticed strong background staining of dextramer HLA-A*24:02 (peptide QYIKWPWYI) in samples of healthy donors, indicating unspecific dextramer binding. Thus, we excluded all sequencing counts from this dextramer in the downstream analysis. For other dextramers, cells were considered CoV2-Dex + when the unique molecular identifier (UMI) count of a CoV2-dextramer was more than ten and more than five times higher than the UMI count of the negative control in the same cell. Cells that were positive for more than one dextramer according to this classification (less than 0.2% of all cells with known TCR) were excluded from the analysis. A TCR clone was considered SARS-CoV-2-specific when at least one cell of the clone was CoV2-Dex + .

Statistics
Wilcoxon-Mann-Whitney test was used for comparisons of two independent groups. Wilcoxon signed-rank test was used for paired testing. P values were adjusted for multiple comparisons with the Holm method. A linear regression model was used to quantify the relationship between variables. Significance was assessed by non-parametric methods unless otherwise specified. All tests were performed two sided. Analyses were performed with R (version 4.0.0 or 4.1.0).

Reporting summary
Further information on research design is available in the Nature Research Reporting Summary linked to this paper.

Data availability
The sequencing dataset generated in this study has been deposited at zenodo.org and is available at https://zenodo.org/record/5770747. Flow cytometry datasets are available from the corresponding author on reasonable request.