An immunodominant NP105–113-B*07:02 cytotoxic T cell response controls viral replication and is associated with less severe COVID-19 disease

NP105–113-B*07:02-specific CD8+ T cell responses are considered among the most dominant in SARS-CoV-2-infected individuals. We found strong association of this response with mild disease. Analysis of NP105–113-B*07:02-specific T cell clones and single-cell sequencing were performed concurrently, with functional avidity and antiviral efficacy assessed using an in vitro SARS-CoV-2 infection system, and were correlated with T cell receptor usage, transcriptome signature and disease severity (acute n = 77, convalescent n = 52). We demonstrated a beneficial association of NP105–113-B*07:02-specific T cells in COVID-19 disease progression, linked with expansion of T cell precursors, high functional avidity and antiviral effector function. Broad immune memory pools were narrowed postinfection but NP105–113-B*07:02-specific T cells were maintained 6 months after infection with preserved antiviral efficacy to the SARS-CoV-2 Victoria strain, as well as Alpha, Beta, Gamma and Delta variants. Our data show that NP105–113-B*07:02-specific T cell responses associate with mild disease and high antiviral efficacy, pointing to inclusion for future vaccine design. Peng et al. find that immunodominant cytotoxic T lymphocytes (CTLs) specific for NP105–113-B*07:02 are associated with reduced COVID-19 severity. Mechanistically, NP105–113-B*07:02-specific CTLs show potent antiviral functionality and may represent rational T cell vaccine targets.

CD8 + T cells play a well-documented role in clearing viral infections. Immunodominance is a central feature of CD8 + T cell responses in viral infections, and understanding the nature of these responses for a given infection, where they are shown to be protective, will be critical for the design of vaccines aiming to elicit optimal CD8 + T cell responses 1,2 .
The role of the immunodominant cytotoxic T cell immune response in protection and potential disease pathogenesis of SARS-CoV-2 infection is currently poorly defined. We and others have identified immunodominant T cell epitopes restricted by common human leukocyte antigen (HLA) types [3][4][5][6] ; in particular, we found multiple dominant epitopes in nucleoprotein (NP) restricted by HLA-B*07:02, -B*27:05, -B*40:01, -A*03:01 and -A*11:01. We also found that multi-functional NP and membrane (M) CD8 + T cell responses are associated with mild disease; NP is one of the most common targets for CD8 + -dominant T cell responses in SARS-CoV-2 infection 3 .
Among the dominant epitopes identified to date, NP 105-113 -B*07:02 appears to be among the most dominant 3,4,6,7 ; notably, no variants are found within this epitope from over 300,000 sequences in COG-UK global sequence data alignment 8 . This suggests that this epitope would be a good target for inclusion within an improved vaccine design, expanded to stimulate effective CD8 + T cell responses, as well as neutralizing antibodies, to protect against newly emergent viral strains that escape antibody responses to spike in some cases 9 .

Biased TRBV27 gene usage, with long CDR3β loops preferentially expressed in NP 105-113 -B*07:02-specific T cell receptors (TCRs), has been observed in both unexposed and COVID-19-recovered individuals 10 . That study suggested a role for cross-reactive responses in COVID-19 based on pre-existing immunity to seasonal coronaviruses or other pathogens. However, a subsequent study suggested that the immunodominant NP 105-113 -B*07:02 CD8 + T cell responses are unlikely to arise from pre-existing cross-reactive memory pools, but rather represent a high frequency of naive T cell precursors found across HLA-B*07:02-expressing individuals 7 .
In this study, we present an in-depth analysis to explore correlations across NP 105-113 -B*07:02-specific T cell responses, TCR repertoires and disease severity. We saw stronger overall T cell responses in individuals recovered from severe COVID-19, which may be explained by high exposure to viral protein; however, we found an immunodominant epitope response (HLA-B*07:02 NP 105-113 -specific CD8 + ) that was significantly associated with mild cases. Importantly, this epitope is one of the most dominant CD8 + T cell epitopes reported so far by us and others. We examined potential mechanisms of protection using single-cell transcriptome analysis, and functional evaluation of expanded T cell clones bearing the same TCRs as those identified in single-cell analysis. We also assessed the ability of T cell lines and clones to mount effective effector function against cells infected with live SARS-CoV-2 virus and vaccinia virus-expressing SARS-CoV-2 proteins. We found that NP 105-113 -B*07:02 is the dominant NP response in HLA-B*07:02-positive patients with mild symptoms, with high frequency and higher magnitude when compared with severe cases. Single-cell analysis revealed that preserved beneficial functional phenotypes are associated with protection from severe illness and with better overall antiviral function. In addition, NP 105-113 -B*07:02-specific T cells can recognize the naturally processed epitope in live virus and recombinant vaccinia virus-infected cells, which correlates with antiviral efficacy.

Results
NP 105-113 -B*07:02-specific T cell responses are stronger in patients recovered from mild COVID-19 infection. A previous study has identified five dominant CD8 + epitopes targeting NP, including the most dominant epitope NP 105-113 (amino acid sequence SPRWYFYYL) restricted by HLA-B*07:02 (ref. 3 ). This present study includes 52 individuals who recovered from COVID-19, comprising 30 mild cases and 22 severe cases (including 4 with critical illness; clinical features summarized in Supplementary Table 1 and Extended Data Fig. 1a-c). All the patients were HLA typed and 19 (36.5%) were HLA-B*07:02 positive (10 mild and 9 severe cases; Extended Data Fig. 1d). We proceeded to carry out ex vivo interferon (IFN)-γ ELISpot assays using HLA-B*07:02-positive convalescent samples 1-3 months postinfection. Of HLA-B*07:02 individuals, 79% (15/19) showed responses to this epitope, which accounted for 29% of individuals from the overall cohort (15/52) (Fig. 1a), including 90% (9/10) of individuals recovered from mild and 67% (6/9) from severe disease (Fig. 1b). This further confirms the dominance of this NP 105-113 -B*07:02 T cell response in our cohort, in particular in individuals recovered from mild illness. In addition, individuals recovered from mild disease made significantly stronger responses to this epitope, compared with those who had recovered from severe disease ( Fig. 1c; P = 0.04). We also observed that this NP 105-113 -B*07:02-specific response is dominant in mild cases and makes up 60% of overall NP responses of each individual, whereas, in severe cases, the proportion is substantially lower, with an average of 19.5% ( Fig. 1d; P = 0.015). In addition, we did not find HLA-B*07:02 association with disease outcome in our study cohorts ( Fig. 1e; 77 acute and 52 convalescent patients). 
Our data highlight the association of the strength of this dominant epitope-induced T cell response with mild disease outcome and provide evidence that this link is epitope specific rather than a wider allelic association with HLA-B*07:02.
Strong cytotoxicity and inhibitory receptor expression are associated with disease severity. To explore the mechanisms underlying this association, we sorted NP 105-113 -B*07:02-specific T cells at a single-cell level with peptide major histocompatibility complex class I (MHC-I) pentamers using flow cytometry. We performed single-cell analysis using SmartSeq2 for peripheral blood mononuclear cell (PBMC) samples from four convalescent patients, including two who recovered from mild COVID-19 infection (C-COV19-005, age 56 years and C-COV19-046, age 76 years) and two who recovered from severe disease in early infection (C-COV19-038, age 44 years and C-COV19-045, age 72 years). TCR sequences and transcriptomic profiles of each single cell were analyzed.
Analysis of single-cell RNA-sequencing (scRNA-seq) data with UMAP visualization and unbiased clustering revealed a homogeneous cell population; therefore we compared gene expression of CD8 + NP-specific, sorted single cells isolated from mild (n = 208 from 2 patients) and severe cases (n = 140 from 2 patients) by scoring expression levels of manually defined gene sets (Supplementary Table 2). Gene signatures associated with T cell cytotoxicity and inhibitory receptors were analyzed and compared between severity groups. We found that cells from patients who had recovered from severe COVID-19 have significantly higher cytotoxicity gene expression scores ( Fig. 2a; P = 0.00032), with upregulation of GZMK (P = 3.02 × 10 −5 ) and GNLY (P = 1.41 × 10 −9 ) (encoding granzyme K and granulysin, respectively) (Fig. 2b). These cells also displayed increased inhibitory receptor expression ( Fig. 2c; P = 0.00072), such as TIGIT, CTLA4 and HAVCR2 (TIM3). This supports findings published by us and others 3,11 where patients with severe COVID-19 disease have been exposed to higher antigen loads, and that these cells are still present at 1-3 months convalescence, rather than CD8 + central memory T cells.
NP 105-113 -B*07:02-specific T cells have a highly diverse TCR repertoire. Consistent with findings by other studies 7,10 , we found that NP 105-113 -B*07:02-specific T cells from our cohort showed very broad TCR repertoires. Circos plots show paired TCR α and β chains (V and J gene usage) from the four individuals analyzed with SmartSeq2 scRNA-seq (Fig. 3a), and the combined TCR repertoire of all four patients represented by the TCR clonotype (defined separately for each patient combining V gene and CDR3 amino acid sequence) (Fig. 3b). Although the NP 105-113 -specific TCR repertoire is diverse, with unique pairings of Vα and Vβ genes, we observed that 15/45 (33.3%) of unique Vβ clonotypes were paired with several distinct Vα clonotypes. By contrast, there is only 1/55 (1.8%) Vα clonotype that pairs to multiple Vβ clonotypes; this highlights the importance of studying Vβ in the TCR repertoire. Further detailed TCR information can be found in Supplementary Table 3.
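The pairing counts quoted above (how many Vβ clonotypes are seen with more than one Vα clonotype) can be tallied as in the following minimal sketch. The function name and the toy clonotype labels are hypothetical and are not taken from Supplementary Table 3; the sketch only illustrates the counting logic, not the authors' pipeline.

```python
from collections import defaultdict


def count_multi_partner(pairs):
    """Count beta clonotypes that pair with more than one alpha clonotype.

    pairs: iterable of (alpha_clonotype, beta_clonotype) labels, one per cell.
    Returns (number of beta clonotypes with >1 distinct alpha partner,
             total number of distinct beta clonotypes).
    """
    partners = defaultdict(set)
    for alpha, beta in pairs:
        partners[beta].add(alpha)
    multi = sum(1 for alphas in partners.values() if len(alphas) > 1)
    return multi, len(partners)


# Toy labels for illustration only (not real clonotypes).
toy_pairs = [("A1", "B1"), ("A2", "B1"), ("A3", "B2"), ("A4", "B3")]
multi, total = count_multi_partner(toy_pairs)
print(f"{multi}/{total} ({100 * multi / total:.1f}%)")  # prints 1/3 (33.3%)
```

Swapping the tuple order gives the reverse count (Vα clonotypes pairing with several Vβ clonotypes).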
CDR3β sequences from patients with mild COVID-19 display higher similarity to naive precursors. Several studies have reported that pre-existing, cross-reactive T cells to SARS-CoV-2 can be detected in unexposed individuals, and these T cells may have resulted from previous human seasonal coronavirus infection 7,10,12,13 . These studies found TCRs specific to NP 105-113 -B*07:02 in SARS-CoV-2-unexposed and -infected individuals. These cells were revealed as likely to be naive 7 ; this is very different from the central/effector memory phenotype of SARS-CoV-2-specific T cells reported earlier. To investigate this further, we sought to determine what role these T cells might play in the early stages of SARS-CoV-2 infection and COVID-19 disease, and if these cells contribute to the association of mild disease due to their specificity for this NP-dominant epitope.
To take advantage of the results from our SmartSeq2 scRNA-seq, we first compared TCR sequences from our four convalescent patients with COVID-19 with prepandemic TCR sequences from healthy donors, published by Lineburg et al. 10 , Nguyen et al. 7 and another study cohort, COMBAT 14 . The COMBAT dataset represents a comprehensive multi-omic blood atlas encompassing acute patients with varying COVID-19 severity (41 mild and 36 severe), and 10 healthy volunteers (prepandemic), using bulk TCR sequencing and CITE-Seq, which combines single-cell gene expression and cell-surface protein expression. TCR sequences from the Lineburg and Nguyen datasets have been experimentally validated to be specific for the NP 105-113 epitope; however, for the COMBAT dataset, we used GLIPH2 analysis 15 to extract TCRs with predicted specificity to this epitope based on convergence with known NP 105-113 -specific TCRs.
We sought to compare NP-specific TCRs from COVID-19 patients and healthy individuals using two different methodologies. First, we calculated similarity scores for CDR3β amino acid sequences between pairwise combinations of SmartSeq2 TCRs and prepandemic/healthy TCRs. A similarity score of 1 indicates that the pair of CDR3β sequences are identical, whereas a score of 0 indicates complete dissimilarity. In our convalescent patient cohort, CDR3β sequences from patients with mild disease are more similar to TCRs from prepandemic/healthy individuals than those from severe patients ( Fig. 4a; P < 2.20 × 10 −16 ). Second, we looked at the proportion of TCR sequences from patients with mild and severe disease (acute cases from COMBAT dataset and convalescent cases from previously described SmartSeq2 patients) that can be found in the same convergence groups as sequences from healthy donors, indicating high CDR3β similarity. Convergence groups containing TCRs from healthy donors appear to contain higher proportions of TCRs from mild cases rather than severe, signifying greater similarity between TCRs from prepandemic individuals and patients with mild disease ( Fig. 4b; P < 2.2 × 10 −16 ).

Fig. 1 (caption fragment) | ... and nonresponders (n = 4) with mild or severe COVID-19 disease. c, Comparison of the magnitude of the response to the NP 105-113 epitope between HLA-B*07:02-positive convalescent patients with COVID-19 (n = 10 mild, n = 9 severe). d, Proportion of NP 105-113 -specific response to overall NP response (n = 10 mild, n = 9 severe). e, Proportion of HLA-B*07:02 individuals compared with combined total acute and convalescent COVID-19 patients (n = 77 acute, n = 52 convalescent). Data are presented as medians with interquartile ranges (IQRs) (c and d). The Mann-Whitney U-test was used for analysis and the two-tailed P value was calculated: *P < 0.05. s.f.u., spot-forming units.
We were able to link predicted NP 105-113 -B*07:02 TCRs with their corresponding single-cell data from the COMBAT dataset (healthy and acute SARS-CoV-2-infected patients). In this way, we could extract single-cell CITE-seq information from the COMBAT dataset, subsetted specifically to cells with predicted NP 105-113 specificity. Cellular subtyping of these CD8 + NP 105-113 -B*07:02 T cells shows a higher proportion of naive T cells in one HLA-B*07:02 healthy individual compared with predominantly T effector memory subtypes in patients with acute COVID-19 (n = 17, Fig. 4c). Overall, our data support the report that T cells bearing TCRs specific to NP 105-113 -HLA-B*07:02 in SARS-CoV-2-unexposed individuals are unlikely to have resulted from previous seasonal coronavirus infection 7 . This reinforces the finding that only NP 105-113 -B*07:02-specific T cells from acute HLA-B*07:02-positive patients are exposed to antigen and undergo T cell differentiation, whereas NP 105-113 -specific T cells in prepandemic individuals are naive precursors rather than memory cells from a previous cross-reactive infection.
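The pairwise CDR3β similarity score used in this section runs from 1 (identical sequences) to 0 (complete dissimilarity). The text here does not state which metric was used, so the sketch below substitutes a normalized Levenshtein similarity purely as an illustration of a score with those two end points. The function names and the toy sequences are hypothetical.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance between two amino acid strings (row-by-row dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cdr3_similarity(a: str, b: str) -> float:
    """Assumed score: 1 - edit distance / longer length (1 = identical, 0 = nothing shared)."""
    if not a and not b:
        return 1.0
    return 1.0 - levenshtein(a, b) / max(len(a), len(b))


def pairwise_scores(query_cdr3s, reference_cdr3s):
    """Score every query sequence against every reference sequence."""
    return [cdr3_similarity(q, r) for q in query_cdr3s for r in reference_cdr3s]


# Toy strings for illustration only (not real CDR3 sequences).
print(pairwise_scores(["CASSAAAF"], ["CASSAAAF", "CASSGGGF"]))
```

Distributions of such scores for the mild and severe groups could then be compared with a rank-based test, as the figure captions describe for other comparisons.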
Broad range and high functional avidity are associated with clonotype expansion in mild disease. In parallel with single-cell sorting for SmartSeq2, we also sorted, cloned and expanded NP 105-113 -B*07:02-specific T cells from the same convalescent patients with COVID-19 in vitro to obtain pure clonal T cell populations 16,17 . We sequenced TCRs from each T cell clone, with the paired TCR α-chain and β-chain of each clone listed in Supplementary Table 4. When comparing the TCR sequences between T cell clones and ex vivo single cells, in vitro expanded T cell clones are a good representation of the T cells isolated for ex vivo single-cell analysis, with expanded TCRs from ex vivo single cells present as dominant TCRs from the T cell clones (Extended Data Fig. 2a).
To provide a link between T cell clones and single-cell data by their respective TCR sequences, we divided all the T cells, including T cell clones and single cells from SmartSeq2, into 18 groups according to their unique human T cell receptor β variable (TRBV) gene usage and CDR3β sequence (Table 1). T cell functional avidity was measured by IFN-γ ELISpot and calculated from the half-maximal effective concentration (EC 50 ) (Extended Data Fig. 2b and Supplementary Table 5). We found evidence for low and high functional avidity groups (Fig. 5a) based on the EC 50 of T cell clones, with EC 50 < 0.11 considered to be high-avidity and EC 50 > 0.11 low-avidity T cells. We then aggregated RNA counts from single cells (pseudobulk) to compare differences in gene expression between the two avidity groups. Although there were only seven significantly differentially expressed genes (Fig. 5b), possibly as a result of small sample sizes and patient variation, differentially expressed genes of note upregulated in high functional avidity cells include IL10RA, PARK7 and LTA4H. The interaction of interleukin (IL)-10 with IL-10 receptor subunit α (IL10RA) expressed on CD8 + T cells has been reported to directly decrease CD8 + T cell antigen sensitivity in patients with chronic hepatitis C infection 18 , whereas Parkinson's disease protein 7 (PARK7) promotes survival and maintains cellular homeostasis in the setting of intracellular stress 19 . Leukotriene A4 hydrolase (LTA4H) is an enzyme with known potent anti-inflammatory activity, which functions as an aminopeptidase to degrade a neutrophil chemoattractant Pro-Gly-Pro (PGP) to facilitate the resolution of neutrophilic inflammation and prevent prolonged inflammation with exacerbated pathology and illness 20 . This supports the idea that high functional avidity T cells undergo stronger antigen stimulation and would therefore start expressing immune-dampening molecules. We further found that patients with mild disease show an increased proportion of high functional avidity TCR clonotypes, which are also more expanded than low functional avidity TCR clonotypes (Fig. 5c), whereas TCR clonotypes from patients with severe disease show equal expansion between high and low functional avidity TCRs. Therefore, the preferential expansion of high functional avidity TCR clonotypes may contribute to mild disease after SARS-CoV-2 infection.

Figure caption fragment: The Mann-Whitney U-test was used for analysis and the two-tailed P value was calculated: ***P < 0.001, ****P < 0.0001.
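The avidity grouping described above (an EC 50 read from a peptide titration, then a cut-off of 0.11) can be illustrated with the sketch below. It estimates the EC 50 by linear interpolation on log-concentration, which is a simplification assumed here and not necessarily the curve fit the authors used. The text gives no units for the cut-off, so the code carries none, and an EC 50 exactly equal to 0.11, which the text does not assign to either group, is arbitrarily treated as low avidity. All names and the toy titration are hypothetical.

```python
import math

# Cut-off quoted in the text; units are not stated there.
HIGH_AVIDITY_EC50_CUTOFF = 0.11


def ec50_by_interpolation(concentrations, responses):
    """Estimate the concentration giving a half-maximal response.

    Interpolates linearly on log10(concentration) between the two titration
    points that bracket half of the maximal observed response.
    """
    pairs = sorted(zip(concentrations, responses))
    half = max(responses) / 2.0
    for (c_lo, r_lo), (c_hi, r_hi) in zip(pairs, pairs[1:]):
        if r_lo < half <= r_hi:
            frac = (half - r_lo) / (r_hi - r_lo)
            log_ec50 = math.log10(c_lo) + frac * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_ec50
    raise ValueError("half-maximal response is not bracketed by the titration")


def avidity_group(ec50):
    """Lower EC50 means less peptide is needed, i.e. higher functional avidity."""
    return "high" if ec50 < HIGH_AVIDITY_EC50_CUTOFF else "low"


# Toy titration for illustration only (not data from Supplementary Table 5).
toy_concs = [0.01, 0.1, 1.0, 10.0]
toy_spots = [0, 100, 200, 200]
estimate = ec50_by_interpolation(toy_concs, toy_spots)
print(round(estimate, 3), avidity_group(estimate))
```

A sigmoidal dose-response fit would normally be preferred when enough titration points are available; interpolation is used here only to keep the example dependency-free.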
The strength of T cells responding to naturally processed epitope correlates with their functional avidity. Numerous studies including our own have shown the importance of antigen processing and presentation to T cell recognition of its antigen 21,22 . Some T cell epitopes may not be processed and presented as efficiently as others, which will subsequently diminish the T cell response to the epitope. To investigate T cell responses to naturally processed and presented viral epitopes, we made vaccinia virus-expressing SARS-CoV-2 viral proteins. We infected autologous … Gating for CD107a- and/or MIP1β-producing cells was based on corresponding negative controls (Extended Data Fig. 3a). When compared with the peptide-loaded targets, we found that the response to vaccinia virus-infected BCLs was much weaker, consistent with lower antigen loads. The loading of this naturally processed and presented epitope was equivalent to no more than 3 nM peptide (Extended Data Fig. 3b). Nevertheless, NP vaccinia virus-incubated clones with high CD107a expression showed a negative correlation with their individual EC 50 values ( Fig. 6b; Spearman's rank correlation coefficient (R) = −0.6176, P = 0.0212), consistent with higher functional avidity resulting in more effective T cell killing. A similar negative correlation was also observed with MIP1β-producing cells ( Fig. 6c; R = −0.6879, P = 0.0082).
To further investigate the antiviral activity of NP 105-113 -B*07:02-specific T cells, we established an in vitro SARS-CoV-2 infection system. Briefly, the angiotensin-converting enzyme 2 (ACE2) gene was delivered into autologous EBV-transformed BCLs by lentiviral transduction to enable SARS-CoV-2 infection via ACE2 protein expressed on the cell surface. ACE2 + BCLs were purified by flow sorting and maintained by antibiotic selection, after which cells were used for SARS-CoV-2 virus infection (Victoria strain). After 48 h of incubation, intracellular viral copies were quantified by quantitative (q)PCR, where the reduction of virus replication is calculated as a percentage of virus suppression by T cells (Fig. 6d). We found that the percentage of virus suppression was strongly correlated with functional avidity ( Fig. 6e; R = −0.7699, P = 0.0075). Therefore, high functional avidity T cells can efficiently inhibit viral replication.
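The two calculations in this paragraph, a percentage of virus suppression and its rank correlation with EC 50 , can be sketched as follows. The suppression formula (reduction in viral copies relative to infected cells cultured without T cells) is an assumption about how the percentage is defined, and the function names and toy numbers are hypothetical. Spearman's coefficient is computed here as the Pearson correlation of average ranks.

```python
def percent_suppression(copies_with_t_cells, copies_without_t_cells):
    """Assumed definition: percentage reduction in viral copies versus the no-T-cell control."""
    return 100.0 * (1.0 - copies_with_t_cells / copies_without_t_cells)


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_r(x, y):
    """Spearman's rank correlation: Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)


# Toy values for illustration only (not data from Fig. 6).
toy_ec50 = [0.02, 0.05, 0.2, 0.9]
toy_suppression = [percent_suppression(c, 1000.0) for c in (100.0, 250.0, 600.0, 900.0)]
print(spearman_r(toy_ec50, toy_suppression))  # negative: lower EC50, more suppression
```

A negative coefficient, as reported in the text, means that clones with lower EC 50 (higher functional avidity) suppress more virus.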
NP 105-113 -B*07:02-specific T cells are maintained with preserved antiviral efficacy 6 months after infection. To examine whether the memory T cells established after natural infection could provide sufficient protection against secondary viral infection, we collected PBMCs from three patients (C-COV19-005, C-COV19-045 and C-COV19-046) 6 months after infection and sequenced sorted CD8 + NP 105-113 -B*07:02-specific T cells. We discovered that, 6 months after infection, the TCR repertoire of NP 105-113 -B*07:02-specific T cells narrows (independent of cell numbers), and the T cell memory pool contains both high and low functional avidity T cells (Fig. 7a). We then isolated and expanded further NP 105-113 -B*07:02-specific T cell bulk lines from PBMC samples taken 6 months after infection. We assessed the antiviral efficacy of these bulk T cell lines in our in vitro SARS-CoV-2 infection assays. All three T cell lines showed increased MIP1β and CD107a protein expression after incubation with NP-expressing vaccinia virus (Extended Data Fig. 4), and increased tumor necrosis factor (TNF) and CD107a expression after incubation with BCLs infected with SARS-CoV-2 virus (Victoria strain) and current variants of concern (VOCs), including the Delta variant ( Fig. 7b and Extended Data Fig. 5). In addition, we found that these antigen-specific bulk cell lines are capable of suppressing SARS-CoV-2 replication (Fig. 7c) and showed strong inhibition against VOCs, including the recently emerged Alpha, Beta, Gamma and Delta SARS-CoV-2 variants (Fig. 7d,e). This is consistent with the evidence of conservation of this NP 105-113 -B*07:02 epitope, and indicates the protective role of NP 105-113 -specific T cells in secondary infection against different SARS-CoV-2 variants.

Discussion
Our observation of strong and dominant NP 105-113 -B*07:02-specific T cell responses in mild cases highlights the possible protective role of this unique and most dominant response found so far in SARS-CoV-2 infection 3-6 . We found high similarity and convergence of TCRs in HLA-B*07:02-positive healthy and recovered individuals, with naive precursors identified in prepandemic samples supporting previous reports 7, 10 . In addition, T cells from convalescent patients with mild disease show higher functional avidity as well as better effector and antiviral function compared with convalescent patients with severe COVID-19. It is interesting that the immune memory pools postinfection (6 months convalescence) are narrowed but remain proportional; we found no bias toward high or low functional avidity TCRs during immune memory contraction. Moreover, this dominant NP 105-113 -specific response restricted by HLA-B*07:02 is associated with protection against severe disease, but does not associate with HLA-B*07:02 when analyzed alone.
The highly diverse TCR repertoire of NP 105-113 -B*07:02-specific T cells in recovered individuals is of particular interest; whether this is a common phenomenon of acute primary virus infection or these responses are unique, with high frequency and broader choice of TCR precursors available, would merit future investigation. The latter is supported by our finding that TCRs in COVID-19-recovered individuals can be similar to those found in prepandemic individuals, in particular patients with mild symptoms. We hypothesize that NP 105-113 -B*07:02-specific T cell responses play an important role in protecting individuals from severe illness, which is probably due to early priming and expansion of high-frequency naive TCRs specific to this epitope.
We further provide evidence to support our hypothesis by studying a cohort of patients with acute SARS-CoV-2 infection, by analyzing the TCR repertoire in HLA-B*07:02-positive patients. We first found high frequencies of TCR precursors with naive phenotype in HLA-B*07:02-positive healthy donors; this further supports the recent findings, from Nguyen et al. 7 , that these T cell precursors bearing NP-specific TCRs are not due to pre-existing memory from seasonal coronaviruses. We observed that strong cytotoxicity and inhibitory receptor expression are associated with disease severity, where NP 105-113 -B*07:02-specific T cells are more activated and well differentiated in individuals recovered from severe illness. This is probably the result of stronger antigen stimulation and expansion during the acute phase of viral infection.
We found overall high functional avidity T cell expansion in mild cases, and that high functional avidity is associated with expression of immune-dampening molecules such as IL10RA, PARK7 and LTA4H, which could potentially act to prevent prolonged inflammation with exacerbated pathology and illness 18,20,23,24 . In particular, LTA4H has a known function as an aminopeptidase to degrade a neutrophil chemoattractant PGP, facilitating the resolution of neutrophilic inflammation, which is known to be associated with immunopathology in respiratory virus infections such as COVID-19 (ref. 25 ). This provides further evidence that expansion of high avidity precursors in mild cases contributes to the overall protective immunity from severe illness. We show that NP 105-113 -B*07:02-specific T cells can respond to cells infected with live SARS-CoV-2 virus as well as emerging viral variants, and most importantly suppress virus replication in infected cells. The magnitude and strength of the response to naturally processed epitopes presented by infected cells correlate with their functional avidity. The proportional expansion with both high and low functional avidity T cells was maintained in CD8 + T cell memory pools after immune memory contraction (at 6 months postinfection), and these cells could suppress virus replication efficiently for all viral variant strains. This is not surprising given the conservation of this epitope across viral strains, and provides some reassurance that memory T cells generated from natural infection could respond to newly emerged variants and still provide protective immunity.
Taken together, we have demonstrated, first, a strong association of the NP 105-113 -HLA-B*07:02-specific T cell response with mild disease; second, that the protection from severe illness associated with NP 105-113 -HLA-B*07:02-specific TCRs may be due to early expansion of high-frequency naive T cell precursors bearing these TCRs. Moreover, we found that the TCR repertoire is not disturbed after virus infection and immune memory contraction, and that these memory T cells are able to suppress the original SARS-CoV-2 …

We recognize that there are a number of limitations to the present study; for example, the number of convalescent patients analyzed by single-cell gene expression and TCR sequencing (n = 4) is small. Also, the number of NP 105-113 -B*07:02-specific cells from prepandemic donors and patients with acute COVID-19 is low, partly because these cells were not pentamer sorted before analysis. In the present study, we focus on CD8 + T cell responses to a single epitope; however, it may be useful in the future to see whether there are any distinct features or features shared with other dominant responses. Although our data support high-frequency naive T cell precursors probably contributing to mild disease outcome, it is also possible that high viral load and overstimulation lead to exhaustion and depletion of high functional avidity T cells (with higher proportion of precursor TCRs) during the acute virus infection, which merits further investigation, including larger cohort sizes. We found that a higher proportion of TCR sequences from mild cases converged with those from prepandemic individuals, although it may be possible that this observation arose from higher numbers of TCRs from mild patients used as input for this convergence analysis.
The specifics of antigen loading of this particular epitope, compared with other NP epitopes, as well as variation in levels of protein expression and localization, are also unknown and warrant further investigation.

Online content
Any methods, additional references, Nature Research reporting summaries, extended data, supplementary information, acknowledgements, peer review information; details of author contributions and competing interests; and statements of data and code availability are available at https://doi.org/10.1038/s41590-021-01084-z.

Fig. 7 | Characterization of NP 105-113 -B*07:02-specific T cell responses at 6 months convalescence. a, TCR repertoires of three patients at 1 month and 6 months convalescence. TRBV gene usage of common and expanded TCR clonotypes (defined as TRBV and TRBJ gene usage) are labeled for clarity. TCR clonotypes colored pink are low functional avidity and blue ones depict high functional avidity; clonotypes colored gray do not have similar TCRs to T cell clones. C-COV19-046 6-month cells were sequenced by 10× single-cell sequencing, and C-COV19-005 and C-COV19-045 by bulk TCR sequencing. NA, not available. b, Representative ICS flow cytometry plots measuring TNF-α and CD107a expression on bulk NP 105-113 -specific T cell lines from C-COV19-046 incubated with SARS-CoV-2 Victoria, Alpha, Beta, Gamma or Delta variant-infected BCLs. c, Inhibition of SARS-CoV-2 viral replication (Victoria strain) by C-COV19-046 bulk NP 105-113 -specific T cell lines from 1-month (gray bars) and 6-month (red bars) convalescent samples (n = 2 biological replicates). Data are shown as mean ± s.d., representing three independent experiments with similar results. d, Antiviral activity of NP 105-113 -specific bulk T cells from 6 months convalescence against SARS-CoV-2 VOCs: Alpha (purple bars), Beta (blue bars) and Gamma (green bars) (n = 3 biological replicates). Data are shown as mean ± s.d., representing three independent experiments with similar results. e, Antiviral activity of NP 105-113 -specific bulk T cells from 6 months convalescence against SARS-CoV-2 VOCs: Victoria strain (gray bars) and Delta variant (orange bars) (n = 6 biological replicates). Data are shown as mean ± s.d., representing two independent experiments with similar results.

Study participants and clinical definitions.
Patients were recruited from the John Radcliffe Hospital in Oxford, UK, between March 2020 and April 2021 by identification of patients hospitalized during the SARS-CoV-2 pandemic. Patients were recruited into the Sepsis Immunomics study and had samples collected during acute disease and convalescence. Convalescent samples were taken at least 28 d after symptom onset. Written informed consent was obtained from all patients. Ethical approval was given by the South Central-Oxford C Research Ethics Committee in England (ref. 19/SC/0296). Clinical definitions were as previously described 3 .
Generating ACE2-transduced EBV-transformed BCLs. EBV-transformed BCLs were generated as described previously 26 . The complementary DNA for the human ACE2 gene (ENSG00000130234) was cloned into a lentiviral vector that allows coexpression of enhanced green fluorescent protein and a puromycin resistance marker (Addgene, plasmid no. 17488). The plasmids were cotransfected with packaging plasmids pMD2.G and psPAX2 into HEK293-TLA using PEIpro

Generating T cell lines and clones.
Short-term SARS-CoV-2-specific T cell lines were established as previously described 17 . Briefly, 3 × 10 6 to 5 × 10 6 PBMCs were pulsed for 1 h at 37 °C with 10 μM peptides, containing T cell epitope regions and cultured in R10 (RPMI 1640 medium with 10% fetal calf serum, 2 mM glutamine and 100 mg ml −1 of penicillin-streptomycin) at 2 × 10 6 cells per well in a 24-well Costar plate. IL-2 was added to a final concentration of 100 U ml −1 on day 3 and cultured for a further 10-14 d. T cell clones were generated by sorting HLA-B*07:02 NP 105-113 pentamer + CD8 + T cells at a single-cell level from thawed PBMCs or short-term cell lines. T cell clones were then expanded and maintained as described previously 27 .

IFN-γ ELISpot assay.
Ex vivo IFN-γ ELISpot assays were performed using either freshly isolated, cryopreserved PBMCs or antigen-specific T cell clones as described previously 3 . For ex vivo ELISpots, peptides were added to 2 × 10 5 PBMCs per test at 2 μg ml −1 for 16-18 h. When using T cell clones, autologous EBV-transformed BCLs were first loaded with peptides at threefold titrated concentrations and subsequently cocultured with T cells at an effector:target (E:T) ratio of 1:50 for at least 6 h. To quantify antigen-specific responses, data were collected with AID ELISpot 7.0, mean spots of the control wells were subtracted from the positive wells (phytohemagglutinin stimulation) and the results expressed as spot-forming units (s.f.u.) per 10 6 PBMCs. Responses were considered positive if results were at least three times the mean of the negative control wells and >25 s.f.u. per 10 6 PBMCs. If negative control wells had >30 s.f.u. per 10 6 PBMCs or positive control wells were negative, the results were excluded from further analysis.
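The positivity and exclusion criteria above can be expressed as a short Python sketch. This is an illustration only, not the analysis code used in the study: the function name `elispot_call` and its parameters are hypothetical, the thresholds are assumed to apply to the background-subtracted value, and the exclusion rule is assumed to apply to the mean of the negative control wells.

```python
def elispot_call(test_sfu, negative_sfu, positive_control_ok=True,
                 fold=3.0, min_sfu=25.0, max_background=30.0):
    """Classify one ELISpot test; all values are s.f.u. per 10^6 PBMCs.

    test_sfu: spots in the peptide-stimulated well.
    negative_sfu: list of spots in the negative control wells.
    positive_control_ok: False if the positive control wells were negative.
    """
    background = sum(negative_sfu) / len(negative_sfu)

    # Exclude the assay if the background is too high or the positive control failed.
    if background > max_background or not positive_control_ok:
        return {"status": "excluded", "net_sfu": None}

    # Subtract the mean of the negative control wells (assumption: the
    # thresholds below are applied to this background-subtracted value).
    net = test_sfu - background
    is_positive = net >= fold * background and net > min_sfu
    return {"status": "positive" if is_positive else "negative", "net_sfu": net}


if __name__ == "__main__":
    print(elispot_call(200, [10, 20]))
    print(elispot_call(500, [40, 30]))
```

If the criteria were instead applied to the raw well counts, only the two comparisons in `is_positive` would change.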

Evaluation of T cell response to vaccinia virus infection. EBV-transformed BCLs were infected with Lister strain vaccinia virus at an MOI of 3 for 90-120 min at 37 °C. Cells were washed to remove any virus and incubated overnight in R10 at 37 °C. Cells were counted and cocultured with T cells at an E:T ratio of 1:1. Degranulation (CD107a expression) and cytokine production of T cells were evaluated by ICS as described above.

Evaluation of T cell response to live virus infection. EBV-transformed
SmartSeq2 scRNA-seq analysis. Cells were filtered using the following criteria: minimum number of cells expressing specific gene = 3, minimum number of genes expressed by cell = 200 and maximum number of genes expressed by cell = 4,000. Cells were excluded if they expressed more than 5% mitochondrial genes. Patient-specific cells were integrated using Harmony v.1.0 to remove batch effects. The AddModuleScore function (Seurat) was used to look at the expression of specific gene sets (Supplementary Table 2). The average expression of a gene set was calculated, and the average expression levels of control gene sets were subtracted to generate a score for each cell relating to that particular gene set.
Higher scores indicate that the signature is expressed more highly in a given cell than in the rest of the population. Module scores were plotted using ggplot2 v.3.3.2 (ref. 39 ). To group as many single cells as possible into one of these 18 groups, the stringsim function (stringdist v.0.9.6 (ref. 43 )) was used to compare the similarity between all SmartSeq2 CDR3β sequences and each of the 18 CDR3β from the single cell/clone grouping. A minimum similarity score of 0.7 was used to decide whether a TCR from a single cell should belong to one of the 18 groups. Once allocated, the single cell was annotated as being high or low functional avidity based on its group number.
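The similarity-based allocation can be illustrated with a minimal Python sketch. It is not the R code used in the study. It assumes that the similarity is one minus the optimal string alignment distance divided by the length of the longer sequence, which is our understanding of the stringsim default, and that a cell is allocated to the best-scoring group. The function names and the CDR3β sequences are hypothetical.

```python
def osa_distance(a, b):
    """Optimal string alignment distance: insertions, deletions,
    substitutions and transpositions of adjacent characters."""
    prev2 = None
    prev = list(range(len(b) + 1))
    for i in range(1, len(a) + 1):
        cur = [i] + [0] * len(b)
        for j in range(1, len(b) + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            cur[j] = min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + cost)
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                cur[j] = min(cur[j], prev2[j - 2] + 1)
        prev2, prev = prev, cur
    return prev[len(b)]


def similarity(a, b):
    """Score in [0, 1]; 1 means identical sequences."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - osa_distance(a, b) / longest


def assign_group(cdr3, reference, min_score=0.7):
    """Return (group, score) for the best-matching reference CDR3β,
    or (None, score) if the best score is below min_score."""
    group, score = max(
        ((g, similarity(cdr3, ref)) for g, ref in reference.items()),
        key=lambda item: item[1],
    )
    return (group, score) if score >= min_score else (None, score)


if __name__ == "__main__":
    # Hypothetical reference groups, for illustration only.
    reference = {1: "CASSLGQGYEQYF", 2: "CASSPRDRGNTEAFF"}
    print(assign_group("CASSLGQGYEQFF", reference))
    print(assign_group("CAWSVQGNQPQHF", reference))
```

The same scoring with a threshold of 0.5 would correspond to the looser cutoff used when overlaying the predicted functional avidity annotation on the alluvial plots.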
TCR sequencing from T cell clones (bulk sequencing). BCL files were converted to FASTQ files as described earlier. TCRs were extracted using MiXCR and the resulting output files (TRA and TRB) were parsed into R using tcR as described earlier. TCRs were filtered to retain 1α1β for each clone. TCR clonotypes (defined as Vβ gene usage and CDR3β sequence) were compared between single TCR and bulk TCR sequencing using ggalluvial v.0.12.2 (ref. 44 ). The predicted functional avidity annotation was overlaid on to the plots using the stringsim function as previously described to classify TCRs into high or low functional avidity groups (minimum score 0.5).
VDJ 10× sequencing. Raw BCL files were processed using 10× Genomics Cellranger v.5.0.0 (ref. 45 ). For donor deconvolution from multiplexed single-cell data, cellSNP v.0.3.2 (ref. 46 ) was used to generate a list of SNPs from Cellranger output (BAM file). Vireo v.0.5.6 (ref. 47 ) was used to demultiplex the sequencing data into individual patients from the pooled sequenced libraries, based on the previously generated SNP list. TCRs from 10× sequencing representing 6-month convalescence were compared with 1-month convalescence TCRs (SmartSeq2) from the same patient using ggalluvial. The predicted functional avidity annotation was overlaid on to the plots using the stringsim function, as previously described, to classify TCRs into high or low functional avidity groups (minimum score 0.5).

Gene expression analysis and cell subtyping from acute COVID-19 dataset.
Normalized single-cell gene expression data for T cells from the COMBAT dataset (level 2 subsets a and b) 14 was annotated with specific T cell subtypes according to COMBAT multimodal analysis, COMBAT TCR chain information and patient metadata. Any cells without both a CD8 + multimodal major cell type classification and TCR chain information were excluded from further analysis. A simplified severity grouping based on the World Health Organization's ordinal scale, which ranges from 0 to 8 (https://www.who.int/blueprint/priority-diseases/key-action/COVID-19_Treatment_Trial_Design_Master_Protocol_synopsis_Final_18022020.pdf), was used to classify participants into the following: uninfected (0), mild (1-4), severe (5-7) or death (8).
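The simplified severity grouping amounts to a lookup on the WHO ordinal score. A minimal Python sketch follows; the function name `severity_group` is hypothetical, and rejecting scores outside 0 to 8 is an assumption about how invalid values might be handled.

```python
def severity_group(who_score):
    """Map a WHO ordinal scale score (0-8) to the simplified severity group."""
    if who_score == 0:
        return "uninfected"
    if 1 <= who_score <= 4:
        return "mild"
    if 5 <= who_score <= 7:
        return "severe"
    if who_score == 8:
        return "death"
    # Assumption: anything outside the 0-8 scale is treated as an error.
    raise ValueError(f"WHO ordinal score out of range: {who_score}")


if __name__ == "__main__":
    for score in range(9):
        print(score, severity_group(score))
```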
GLIPH2 analysis. A GLIPH2 CD8 + TCR input file was created from the following datasets: COMBAT 10× paired-chain single-cell and bulk TCR from all available participants 14 ; pentamer-sorted NP 105-113 -B*07:02-specific TCR sequences and clonally expanded cells used to test functional avidity processed using MiXCR (as described previously); and NP 105-113 -B*07:02-specific TCR sequences from the Lineburg and Nguyen datasets 7,10 . Clonotypes were defined as having a unique combination of CDR3β amino acid sequence, TRBV gene, TRBJ gene and CDR3α amino acid sequence. Where no or multiple CDR3α sequences were available for a cell, a not available (NA) value was used for the CDR3α field in accordance with GLIPH2 input guidelines. For each clonotype, additional information indicating dataset origin was appended as part of the 'condition' field. For the 10× COMBAT dataset, CD8 + clonotypes were distinguished from CD4 + clonotypes based on the multimodal classification of cells within each clone.
A matching GLIPH2 participant HLA input file was created using COMBAT formal HLA-typing data and, where no formal typing was available, from imputed HLA typing 3,14 , in addition to published HLA data relating to the Lineburg and Nguyen datasets 7,10 .
Convergence groups from this file were further categorized as being associated with or lacking association with HLA-B*07:02 based on having a GLIPH2 HLA score <0.05 or ≥0.05, respectively. Only clonotypes belonging to an HLA-B*07:02-associated convergence group, which were from participants known to have an HLA-B*07:02 allele, were deemed to be HLA-B*07:02-positive TCRs. Any clonotypes from convergence groups lacking HLA-B*07:02 association, but belonging to patients with an HLA-B*07:02 allele, were deemed ambiguous and excluded from the HLA-B*07:02-negative clonotype set.
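The classification rules above can be summarized in a short Python sketch. The function name and the labels are hypothetical. The text does not state how clonotypes from an HLA-B*07:02-associated group in a participant without the allele were handled, so the sketch labels them "unclassified". Treating clonotypes from non-associated groups in participants without the allele as the negative set is also an assumption.

```python
def classify_clonotype(group_hla_score, participant_has_b0702, alpha=0.05):
    """Label one clonotype from its convergence group's GLIPH2 HLA score
    and the participant's HLA-B*07:02 status."""
    group_associated = group_hla_score < alpha

    if group_associated and participant_has_b0702:
        return "B0702_positive"
    if not group_associated and participant_has_b0702:
        # Deemed ambiguous and excluded from the negative set.
        return "ambiguous"
    if not group_associated:
        # Assumption: non-associated group, participant lacks the allele.
        return "B0702_negative"
    # Not covered by the text: associated group, participant lacks the allele.
    return "unclassified"


if __name__ == "__main__":
    print(classify_clonotype(0.01, True))
    print(classify_clonotype(0.20, True))
    print(classify_clonotype(0.20, False))
```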
Similarity between prepandemic and convalescent COVID-19 TCRs. NP 105-113 -specific TCRs from prepandemic individuals (predicted from the COMBAT dataset or experimentally defined by the Lineburg and Nguyen datasets 7,10 ) were compiled to form a single list of sequences (237 TCRs). Similarity scores were calculated from pairwise comparisons between each CDR3β sequence from the prepandemic/healthy list and each CDR3β sequence from 85 unique clonotypes of 4 convalescent patients with COVID-19 (clonotype defined per patient, TRBV gene usage and CDR3β sequence). A score of 1 indicates total similarity whereas a score of 0 is total dissimilarity. Each score was plotted on a box plot using ggplot2.
Fig. 2 | TCR clonotypes for single cells and T cell clones and EC50 derivation. (a) Flow diagrams to show T cell clonotypes (defined as TRBV gene usage and CDR3β sequence) between SmartSeq2 sequenced ex vivo single cells (ex vivo single T cells column) and bulk TCR sequenced T cell clones grown in vitro (T cell clones column). Panel on the left shows grouped TCRs from patients with mild disease, panel on the right for patients with severe disease. (b) Upper panel: IFN-γ ELISpot assay for representative high and low functional avidity clones for each patient in blue and red respectively (C-COV19-038 only has low functional avidity clone shown). Lower panels: high functional avidity clone from C-COV19-45 and low functional avidity clone from C-COV19-038 with example of EC50 derivation.