Main

Children with severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infections typically have mild disease3,4, but can develop a rare life-threatening post-infectious complication known as MIS-C1,2. MIS-C presents with a distinctive inflammatory signature indicative of altered innate immune responses5,6, including dysregulation of the mitochondrial antiviral signalling (MAVS) protein pathway7. Aberrant adaptive immunity is also involved, with multiple MIS-C-associated autoantibodies reported8,9,10,11,12. Furthermore, T cell signatures have also been associated with development of MIS-C13,14,15,16, which are accompanied by autoimmune-associated B cell expansions8. Some autoimmune diseases have been shown to involve tandem cross-reactive B cell and T cell responses. In multiple sclerosis, for example, cross-reactive B cells and T cells have been shown to respond to Epstein–Barr virus protein (EBNA1) and antigens in the human nervous system17,18,19. Decades of research into paraneoplastic autoimmune encephalitis has also demonstrated that autoreactive B cells and T cells can cause disease through coordinated targeting of a shared intracellular antigen and, in certain cases, a shared epitope20,21,22,23,24,25,26. Despite intense interest, a pathophysiological link between SARS-CoV-2 and MIS-C remains enigmatic, and identification of disease-specific autoantigens remains incompletely explored. Here children previously infected with SARS-CoV-2 with (n = 199) and without (n = 45) MIS-C were enrolled and comprehensively evaluated for differential autoreactivity to the entire human and SARS-CoV-2 proteome. Patients with MIS-C were found to have both cross-reactive antibodies and T cells targeting an epitope motif shared by the viral nucleocapsid protein and human SNX8, a protein involved in MAVS antiviral function27. These findings suggest that many cases of MIS-C may be triggered by molecular mimicry and could provide a framework for identifying potential cross-reactive epitopes in other autoimmune and inflammatory diseases with predicted viral triggers such as Kawasaki disease28, type 1 diabetes mellitus (T1DM)29 and multiple sclerosis.

Patients with MIS-C have a distinct set of autoreactivities

To explore the hypothesis that MIS-C is driven by an autoreactive process, we evaluated the proteome-wide autoantibody profiles of children with MIS-C (n = 199) and children convalescing following asymptomatic or mild SARS-CoV-2 infection without MIS-C (n = 45, hereafter referred to as ‘at-risk controls’) using our custom phage immunoprecipitation and sequencing (PhIP-seq)30 library, which has previously been used to define novel autoimmune syndromes and markers of disease for various conditions12,24,25,31,32,33. Given the inherently heterogeneous nature of antibody repertoires among individuals34, the identification of disease-associated autoreactive antigens requires the use of large numbers of cases and controls12. To minimize spurious hits, this study includes substantially more patients with MIS-C and controls than similar, previously published studies8,9,10,12 (Fig. 1a). Clinical characteristics of this cohort are described in Extended Data Table 1.

Fig. 1: Autoantigens distinguish MIS-C from at-risk controls.
figure 1

a, Design of the PhIP-seq experiment comparing patients with MIS-C (n = 199) and at-risk controls (n = 45; children with SARS-CoV-2 infection at least 5 weeks before sample collection without symptoms of MIS-C). Schematics in panel a were created using BioRender (https://www.biorender.com). b, Venn diagram highlighting the number of autoantigens identified with statistically significant PhIP-seq enrichment (‘enrichment set’: grey circle; P < 0.01 on one-sided Kolmogorov–Smirnov test with false discovery rate correction) and autoantigens identified, which contribute to a logistic regression classifier of MIS-C relative to at-risk controls (‘classifier set’: purple circle). There are 35 autoantigens present in both the classifier set and the enrichment set (pink; union of the Venn diagram) of which 30 are exclusive to MIS-C and referred to as the ‘MIS-C set’ (no two controls have low reactivity as defined by the fold-change (FC) signal over the mean of protein A/G beads only (FC > mock-IP) of 3 or greater, and no single control has high reactivity defined as FC > mock-IP greater than 10). LR, logistic regression. c, Receiver operating characteristic curve for the logistic regression classifier showing upper and lower bounds of performance through 1,000 iterations. d, Bar plots with error bars showing logistic regression coefficients for the top 10 autoantigens across 1,000 iterations. The whiskers extend to 1.5 times the interquartile range (IQR) from the quartiles. The boxes represent the IQR, and the centre lines represent the median. e, Hierarchically clustered (Pearson) heatmap showing the PhIP-seq enrichment (FC > mock-IP) for the 30 autoantigens in the MIS-C set in each patient with MIS-C and each at-risk plasma control.

Source Data

For a given set of samples, PhIP-seq can yield dozens to thousands of differential enrichments of phage-displayed peptides. Here logistic regression machine learning was used as an initial unbiased measure of how accurately a set of differentially enriched peptides could classify people with MIS-C and controls—an approach that has been used to classify people with autoimmune polyglandular syndrome type 1 using PhIP-seq data12. In all, 107 proteins had logistic regression coefficients greater than zero (‘classifier set’; Fig. 1b). As this is an unbalanced dataset with a random accuracy less than 50%, we also generated a receiver operating characteristic (ROC) curve. ROC analysis iterated 1,000 times and yielded an average area under the curve (AUC) of 0.94 (Fig. 1c). Examination of the logistic regression coefficients associated with MIS-C revealed the largest contributions from peptides derived from the ETS repressor factor-like (ERFL), sorting nexin 8 (SNX8) and KDEL endoplasmic reticulum protein retention receptor 1 (KDELR1) coding sequences (Fig. 1d).

In parallel, a Kolmogorov–Smirnov test was used to define a set of 661 autoreactivities statistically enriched after false discovery rate adjustment for multiple comparisons (q < 0.01; ‘enrichment set’). To avoid false positives, the intersection of the classifier set and enrichment set were considered further. Of these 35 hits, peptides derived from 30 different proteins satisfied an additional set of conservative criteria, requiring that none was enriched (fold change over mock-immunoprecipitation (IP) of more than 3) in more than a single control, or was enriched more than 10-fold in any control (‘MIS-C set’; Fig. 1e).

Previously reported MIS-C autoantibodies

To date, at least 34 autoantigen candidates have been reported to associate with MIS-C8,9,10,12. However, we found that only UBE3A (a ubiquitously expressed ubiquitin protein ligase) was differentially enriched in our MIS-C dataset, whereas the remaining 33 were present in a similar proportion of cases with MIS-C and at-risk controls (Extended Data Fig. 1a). Autoreactivity to UBE3A was independently identified in this study as part of both the classifier and the enrichment sets, but was not included in the final MIS-C set due to the low positive signal present in two controls.

In addition, autoantibodies to the receptor antagonist IL-1RA have been previously reported in 13 of 21 (62%) patients with MIS-C11. In this cohort, anti-IL-1RA antibodies were detected by PhIP-seq (z score > 6 over at-risk control) in six patient samples. To further examine immune reactivity to full-length IL-1RA, sera from 196 of the 199 patients in this study were used to immunoprecipitate [35S]-methionine-radiolabelled IL-1RA (radioligand-binding assay (RLBA)). Positive immunoprecipitation of IL-1RA (defined as more than 3 s.d. above mean of controls) was found in 39 of 196 (19.9%) patients with MIS-C. However, many patients with MIS-C were treated with intravenous immunoglobulin (IVIG), a blood product shown to contain autoantibodies35. After removing samples from patients treated with IVIG (61 remaining), the difference between samples from patients with MIS-C (5 of 61, 8.2%) and at-risk controls (1 of 45, 2.2%) was not significant (P = 0.299; Extended Data Fig. 1b).

MIS-C autoantigens lack tissue-specific associations with clinical phenotypes

Consistent with previous MIS-C reports1,5, this cohort was clinically heterogeneous (Extended Data Table 2). To determine whether specific phenotypes, including myocarditis and the requirement of vasopressors, might be associated with specific autoantigens present in the MIS-C set, tissue expression levels were assigned to each autoantigen36 (Human Protein Atlas; https://proteinatlas.org), including the amount of expression in cardiomyocytes and the cardiac endothelium. The PhIP-seq signal for patients with MIS-C with a particular phenotype was compared with those patients with MIS-C without the phenotype. Autoantigens with tissue specificity were not enriched in those patients with MIS-C with phenotypes involving said tissue. Similarly, autoantigens associated with myocarditis or vasopressor requirements did not correlate with increased cardiac expression (Extended Data Fig. 1c).

Orthogonal validation of PhIP-seq autoantigens

Peptides derived from ERFL, SNX8 and KDELR1 carried the largest logistic regression coefficients in the MIS-C classifier. The PhIP-seq results were orthogonally confirmed by RLBAs using full-length ERFL, SNX8 and KDELR1 proteins. Relative to at-risk controls, samples from patients with MIS-C significantly enriched each of the three target proteins (P < 1 × 10−10 for ERFL, SNX8 and KDELR1), consistent with the PhIP-seq assay (Extended Data Fig. 2a). Using only the RLBA data for these three proteins, MIS-C could be confidently classified (ROC with fivefold cross-validation; 1,000 iterations) from at-risk control sera with an AUC of 0.93, suggesting the potential for molecular diagnostic purposes (Extended Data Fig. 2b).

As noted, IVIG was administered to 138 of the 199 patients with MIS-C before sample collection and was absent from all 45 at-risk controls. The autoreactivity to the ERFL, SNX8 and KDELR1 proteins from the 61 patients with MIS-C who had not been treated with IVIG before sample collection were compared with the at-risk controls. In contrast to IL-1RA, the differential enrichment of these three proteins remained significant (P = 6.69 × 10−10, P = 6.26 × 10−5 and P = 0.0001, respectively), suggesting that autoreactivity to ERFL, SNX8 and KDELR1 proteins was not confounded by IVIG treatment (Extended Data Fig. 2c).

Independent MIS-C cohort validation

To further test the validity of these findings, an independent validation cohort consisting of samples from 24 different patients with MIS-C and 29 children with severe acute COVID-19 was evaluated (acquired via ongoing enrolment of the Overcoming COVID-19 study; Extended Data Table 3). Using RLBAs with full-length ERFL, SNX8 and KDELR1 proteins, we found that all three target proteins were significantly enriched compared with both the at-risk controls (P = 0.00022, P = 3.68 × 10−5 and P = 2.36 × 10−5, respectively) and the patients with severe acute COVID-19 (P = 0.0066, P = 0.00735 and P = 0.00114, respectively; Extended Data Fig. 2d). A logistic regression model, trained on the original cohort, classified MIS-C from at-risk controls with an AUC of 0.84, and from severe acute paediatric COVID-19 with an AUC of 0.78 (Extended Data Fig. 2e). This suggests that autoreactivity to ERFL, SNX8 and KDELR1 is a significant feature of MIS-C that is separable from SARS-CoV-2 exposure and severe acute paediatric COVID-19.

MIS-C autoantibodies target a single epitope within the SNX8 protein

SNX8 is a protein that is 456 amino acids and belongs to a family of sorting nexins involved in endocytosis, endosomal sorting and signalling37. Publicly available expression data36 (Human Protein Atlas) show that SNX8 is widely expressed across various tissues including the brain, heart, gastrointestinal tract, kidneys and skin, with the highest expression in undifferentiated cells and immune cells. Previous work has associated SNX8 with host defence against RNA viruses27. ERFL is a poorly characterized 354-amino acid protein. A survey of single-cell RNA sequencing (scRNA-seq) data36 (Human Protein Atlas) suggests enrichment in plasma cells, B cells and T cells in some tissues. Using a Spearman correlation in principal component analysis (PCA) space based on tissue RNA-seq data36 (Human Protein Atlas), SNX8 has the second closest expression pattern to ERFL compared with all other coding genes, with a correlation coefficient of 0.81. KDELR1 is a 212-amino acid endoplasmic reticulum–Golgi transport protein essential to lymphocyte development with low tissue expression specificity. All three proteins are predicted to be intracellular, suggesting that putative autoantibodies targeting these proteins are unlikely to be sufficient for disease pathology on their own. However, autoantibodies targeting intracellular antigens are often accompanied by autoreactive T cells specific for the protein from which that antigen was derived, and which targets cell types expressing the protein22,25,26,38. We selected SNX8 for further investigation, given its enrichment in immune cells and its putative role in regulating the MAVS pathway in response to RNA virus infection, a pathway implicated in MIS-C pathology7.

Full-length SNX8 is represented in this PhIP-seq library by 19 overlapping 49-mer peptides. For all but one patient sample, the peptide fragment spanning amino acid positions 25–73 was the most enriched in the PhIP-seq assay (Fig. 2a), suggesting a common autoreactive site. A sequential alanine scan was performed to determine the minimal immunoreactive peptide sequence (Fig. 2b; Methods). Using samples from six individuals with MIS-C, we determined that the critical region for immunoreactivity was a nonamer spanning positions 51–59 (PSRMQMPQG). Using the wild-type 49-amino acid peptide and the version with the critical region mutated to alanine, 182 of the 199 patients with MIS-C (insufficient sample for the remaining 17) and all 45 controls were assessed for immunoreactivity using a split-luciferase-binding assay (SLBA). We found that samples from 31 of 182 (17.0%) patients with MIS-C immunoprecipitated the wild-type fragment. Of these, 29 (93.5%) failed to immunoprecipitate the mutated peptide, suggesting a common shared autoreactive epitope among nearly all of the patients with MIS-C with anti-SNX8 antibodies (Extended Data Fig. 2f).

Fig. 2: Autoantibodies in patients with MIS-C target a single epitope within SNX8.
figure 2

a, PhIP-seq signal (reads per 100,000) for each patient with MIS-C (n = 199) and each at-risk control (n = 45) across each of the 19 bacteriophage-encoded peptide fragments, which together tile the full-length SNX8 protein. b, SLBA enrichments (normalized antibody indices) for each sequential alanine mutagenesis construct. Constructs were designed with 10 amino acid alanine windows (highlighted in purple) shifted by 5 amino acids until the entire immunodominant SNX8 region (SNX8 fragment 2) was scanned. Values are averages of six separate patients with MIS-C. The identified autoantibody epitope is bounded by vertical grey dotted lines.

Source Data

Patients with MIS-C have an altered antibody response to the SARS-CoV-2 nucleocapsid protein

To evaluate whether differences exist in the humoral immune response to SARS-CoV-2 infection in patients with MIS-C relative to at-risk controls, we repeated PhIP-seq with 181 of the original 199 patients with MIS-C and all 45 of the at-risk controls using a previously validated library specific for SARS-CoV-2 (ref. 39). To discover whether certain fragments were differentially enriched in either patients with MIS-C or at-risk controls, the enrichment of each phage encoded SARS-CoV-2 peptide (38 amino acids each) across all patients with MIS-C and at-risk controls was normalized to 48 healthy controls pre-COVID-19. Three nearly adjacent peptides derived from the SARS-CoV-2 nucleocapsid protein (fragments 5, 8 and 9) were significantly enriched (Kolmogorov–Smirnov test P < 0.0001 for each). The first peptide (fragment 5), spanning amino acids 77–114, was significantly enriched in the at-risk controls (representing the typical serological response in children), whereas the next two fragments (fragments 8 and 9), spanning amino acids 134–190, were significantly enriched in patients with MIS-C (Fig. 3a,b). The most differentially reactive region of the SARS-CoV-2 nucleocapsid protein in patients with MIS-C (fragment 8) was termed the MIS-C-associated domain of SARS-CoV-2 (MADS). The PhIP-seq results were orthogonally confirmed using an SLBA measuring the amount of MADS peptide immunoprecipitated with samples from 16 individuals, including 11 patients with MIS-C and 5 at-risk controls (Fig. 3c). To precisely map the minimal immunoreactive region of MADS in MIS-C samples, peptides featuring a sliding window of ten alanine residues were used as the immunoprecipitation substrate for SLBAs, run in parallel with the SNX8 alanine scanning peptides using sera from three patients with MIS-C (Fig. 3d). The critical regions identified here in both SNX8 and MADS were highly similar, represented by the (ML)Q(ML)PQG motif (Fig. 3e).

Fig. 3: Antibodies from patients with MIS-C preferentially target a distinct region of the SARS-CoV-2 nucleocapsid protein.
figure 3

a, Relative PhIP-seq signal (FC over the mean) of 48 controls who are pre-COVID-19 (FC > pre-COVID-19) in patients with MIS-C (n = 181) and at-risk controls (n = 45) using a custom phage display library expressing the entire SARS-CoV-2 proteome to different regions of SARS-CoV-2. Only regions with a mean antibody signal of more than 1.5-fold above pre-COVID-19 controls are shown. Antigenicity (sum of the mean FC > pre-COVID-19 in MIS-C and at-risk controls) are represented by darker shades. The length of the bars represents the statistical difference in signal between MIS-C and at-risk controls to a particular region (−log10 of two-sided Kolmogorov–Smirnov test P values), with upward deflections representing enrichment in MIS-C versus at-risk controls, and downward deflections representing less signal in MIS-C. The asterisk indicates the differentially reactive region of the nucleocapsid (N) protein. b, Bar plots showing the PhIP-seq signal (FC > pre-COVID-19) across the specific region of the SARS-CoV-2 nucleocapsid protein (fragments 4–9) with the most divergent response in MIS-C samples (n = 181) relative to at-risk controls (n = 45), compared using a two-sided Kolmogorov–Smirnov test (exact P values are shown in the figure). The amino acid sequence of the region with the highest relative enrichment in MIS-C is highlighted in green and referred to as MADS. c, Strip plots and box plots showing MADS SLBA enrichments (normalized antibody indices) in patients with MIS-C (n = 11) relative to at-risk controls (n = 5). d, SLBA signal (normalized antibody indices) for full sequential alanine mutagenesis scans within the same three individuals for SNX8 (left) and MADS (right). Each identified epitope is bounded by black vertical dotted lines. e, Multiple sequence alignment of SNX8 and MADS epitopes with the amino acid sequence for the similarity region shown (for the text in colour, biochemically similar is in orange, and identical is in red). For the box plots (b,c), the whiskers extend to 1.5 times the IQR from the quartiles. The boxes represent the IQR, and the centre lines represent the median.

Source Data

Patients with MIS-C have significantly increased SNX8 autoreactive T cells

In other autoimmune diseases, autoantibodies often arise to intracellular targets, yet the final effectors of cellular destruction are autoreactive T cells22,26,40. Given evidence that certain subsets of MIS-C are associated with HLA16, and that SNX8 is an intracellular protein, we hypothesized that patients with MIS-C with anti-SNX8 antibodies may, in addition to possessing SNX8 autoreactive B cells, also possess autoreactive T cells targeting SNX8-expressing cells. To test this hypothesis, T cells from nine patients with MIS-C (eight from SNX8 autoantibody-positive patients and one who was SNX8 autoantibody negative) and ten at-risk controls (chosen randomly) were exposed to a pool of 15-mer peptides with 11-amino acid overlaps tiling the full-length human SNX8 protein. T cell activation was measured by an activation-induced marker assay, which quantifies upregulation of three cell activation markers: OX40, CD69 and CD137 (ref. 41). The percent of T cells activated in response to SNX8 protein was significantly higher in patients with MIS-C than in controls (P = 0.00126). Using a positive cut-off of 3 s.d. above the mean of the controls, 7 of the 9 (78%) patients with MIS-C were positive for SNX8-expressing autoreactive T cells, whereas 0 of 10 (0%) controls met these criteria (Fig. 4a). With respect to CD4+ and CD8+ subgroups, there was an increased signal in patients with MIS-C compared with controls, which did not meet significance (P = 0.0711 and P = 0.0581, respectively; Extended Data Fig. 3a). The patient with MIS-C who was seronegative for the SNX8 autoantibody was also negative for SNX8 autoreactive T cells.

Fig. 4: SNX8 autoreactive CD8+ T cells in patients with MIS-C are cross-reactive to the nucleocapsid protein.
figure 4

a, Strip plots and box plots showing the distribution of T cells activated in response to either vehicle (culture media + 0.2% DMSO) or the SNX8 peptide pool (SNX8 peptide + culture media + 0.2% DMSO) in patients with MIS-C (n = 9) and controls (n = 10). The relative signal was compared using a two-sided Mann–Whitney U-test (exact P values are shown in the figure). The box plot whiskers extend to 1.5 times the IQR from the quartiles, the boxes represent the IQR, and the centre lines represent the median. The dashed line is 3 s.d. above the mean of the controls in the SNX8 pool condition. b, TCRdist similarity network of 48 unique, paired TCRαβ sequences (n = 259 sequences) obtained from four patients with MIS-C. CD8+ T cells were sorted from PBMCs directly ex vivo or after 10 days of peptide expansion and staining with A*02:01 or A*02:06 HLA class I tetramers loaded with MADS (LQLPQGITL) and SNX8 (MQMPQGNPL) peptides. Each node represents a unique TCR clonotype. Edges connect nodes with a TCRdist score of less than 150. The dashed lines surround TCR similarity clusters. The node size corresponds to the T cell clone size. Nodes are coloured based on the HLA experiment type (left) or patient (right). TCRs selected for further testing are numbered TCR 1–8. The convergent node is circled in green. c, Specificity of putative cross-reactive TCRs expressed in Jurkat-76 cells by HLA-A*02:01 or HLA-A*02:06 tetramers loaded with MADS (LQLPQGITL) and SNX8 (MQMPQGNPL) peptides. Jurkat-76 (TCR-null) cells were used as tetramer background staining controls. The gate values indicate the frequency of MADS–APC+ and/or SNX8–BV421+ cells as the percentage of the total PE+ cells (combination staining with MADS–PE and SNX8–PE tetramers). TCRs with confirmed cross-reactivity are indicated in red. Outliers are shown. Flow plots are representative of two independent evaluations. d, Summary of TCR sequencing results of the eight TCRs tested.

Source Data

HLA type A*02 is more likely to present the shared epitope

MIS-C has been associated with HLA alleles A*02, B*35 and C*04 (ref. 16). The Immune Epitope Database and Analysis Resource (https://IEDB.org)42 was used to rank the HLA class I (HLA-I) peptide presentation likelihoods for both SNX8 and SARS-CoV-2 nucleocapsid protein with respect to the MIS-C-associated HLA alleles. The distribution of predicted HLA-I-binding scores for nucleocapsid protein and SNX8 fragments matching the (ML)Q(ML)PQG SNX8/MADS motif relative to fragments lacking a match was compared. For HLA-A*02, predicted HLA-I binding was significantly higher (P = 8.78 × 10−10 for nucleocapsid protein; P = 0.0112 for SNX8) for fragments containing the putative autoreactive motif. There was no statistical difference for HLA-B*35 and HLA-C*04 predictions (Extended Data Fig. 3b,c). Of note, of the seven patients with MIS-C with SNX8 autoreactive T cells, at least five were positive for HLA-A*02 (Extended Data Fig. 3a). To experimentally validate HLA-I-binding predictions to SNX8 and MADS peptides, we measured peptide–HLA (pHLA) monomer stability using a β2 microglobulin (β2m) fold test, which is a proxy for pHLA-binding affinity in which anti-β2m staining reports on the strength of the pHLA complex43. SNX8 (MQMPQGNPL) and MADS (LQLPQGITL) peptides were loaded onto unfolded HLA-A*02:01, HLA-A*02:06 or HLA-B*35:01 monomers and stained with an anti-β2m fluorescent antibody. Consistent with the IEDB rankings, both HLA-A*02 alleles bound SNX8 and MADS peptides, with HLA-A*02:06 exhibiting the highest pHLA complex stability (Extended Data Fig. 3d).

T cells from patients with MIS-C are cross-reactive to the SNX8 and nucleocapsid protein similarity regions

Given the prediction that HLA types associated with MIS-C preferentially display peptides containing the similarity regions for both SNX8 and the SARS-CoV-2 nucleocapsid protein, we sought to determine whether cross-reactive T cells were present and whether they were associated with MIS-C. We stimulated peripheral blood mononuclear cells (PBMCs) from three patients with MIS-C and three at-risk controls with peptides from either the SNX8 similarity region (MQMPQGNPL) or the MADS similarity region (LQLPQGITL) for 7 days to enrich for CD8+ T cells reactive to these epitopes. We then built differently labelled HLA-I tetramers loaded with either the SNX8 or MADS peptides and measured binding to T cells (Extended Data Fig. 4a). We detected cross-reactive CD8+ T cells, which bound both peptide epitopes, in all three patients with MIS-C, whereas no cross-reactive CD8+ T cells were observed in at-risk controls (Extended Data Fig. 4b).

As SNX8-responsive T cells were observed in patients with MIS-C, we next asked whether the region of SNX8 similar to the SARS-CoV-2 MADS region was sufficient to activate patient T cells. A pool of 20 10-mer peptides with 9-amino acid overlaps centred on the target motif from SNX8 (collectively spanning amino acids 44–72) was used to stimulate PBMCs from two patients with MIS-C and four at-risk controls. Both patients with MIS-C had activation of T cells, whereas none of the four controls had T cell activation (Extended Data Fig. 4c).

Identification of ex vivo cross-reactive T cell receptors

Having determined that patients with MIS-C, but not controls, contained putative SNX8/MADS cross-reactive CD8+ T cells, we next sought to identify T cell receptor (TCR) sequences with specificity for both the SARS-CoV-2 MADS and the host SNX8 epitopes. To do this, PBMCs were obtained during the first 72 h of hospital admission from four study participants with HLA-A*02 and confirmed MIS-C (one individual previously identified as having putative cross-reactive T cells, and three new patients). Given that MIS-C PBMCs represent a scarce resource, we chose to expand one aliquot of PBMCs from each of the four participants (distinct from our previous peptide expansion protocol; see Methods) to maximize the chances of isolating putative cross-reactive TCRs. Although the frequency of ex vivo autoantigen-specific CD8+ T cells are extraordinarily low in peripheral blood, even for bona fide T cell-mediated autoimmune diseases such as T1DM38 and multiple sclerosis44,45, we nevertheless utilized the remaining PBMCs from each participant for direct ex vivo analysis without previous expansion. To isolate the antigen-specific TCRs, participant cells (both ex vivo and following peptide expansion) were stained using the same tetramer-labelling strategy, which previously identified the putative cross-reactive TCRs (Extended Data Fig. 4a); any cell exhibiting binding to at least two peptide-loaded tetramers was individually sorted and full-length paired TCRα and TCRβ sequences were determined. This resulted in 259 complete TCR sequences, comprising 30 and 18 unique T cell clones from the ex vivo and peptide expansion experiments, respectively. A complete list of TCR sequences is provided (Fig. 4 source data).

Next, we sought to validate the specificity of putative SNX8/MADS cross-reactive TCRs identified from the tetramer sorting, and further analyse features of the recovered TCRs. Because clusters of similar TCRs tend to recognize similar peptide antigens, a TCR similarity network was constructed from all 259 full-length TCR sequences using a previously established TCR distance metric (TCRdist)46,47 (Fig. 4b and Extended Data Fig. 4d). In two of the four patients, we identified unique populations of clonally expanded T cells expressing putative cross-reactive TCRs directly ex vivo, whereas each of the four patients had at least one ex vivo putative cross-reactive TCR (Fig. 4b). To confirm the specificity of the TCRs identified in our tetramer sorting, we selected eight TCR sequences for additional validation and generated individual cell lines that stably expressed one TCR of interest (Extended Data Fig. 5a). These Jurkat-TCR+ cell lines were tetramer stained, and cross-reactivity was confirmed in three of the Jurkat-TCR+ cell lines (TCR 1, 7 and 8; Fig. 4c). Of these validated cross-reactive TCRs, two were obtained from ex vivo PBMCs from patients with MIS-C including TCR 7, which was clonally expanded. The minimum ex vivo frequency of TCR 7 alone was more than 1 in 25,000 (6 of 140,035) circulating CD8+ T cells. The two cross-reactive TCRs obtained from the ex vivo isolation were derived from the same participant, utilize the same TRAV gene (TRAV1-2) with identical CDR3α sequences and clustered with three additional sequences in the TCRdist space, one of which was also clonally expanded, suggesting that this patient had an active expansion of a large cluster of SNX8/MADS cross-reactive CD8+ T cells (Fig. 4d). Furthermore, we note a cluster of two similar TCRs obtained from ex vivo sampling of different participants (patients 2 and 4) with different HLA types (‘convergent node’; circled in green in Fig. 4b). Although these putative cross-reactive TCRs were not evaluated further, the cluster suggests that TCR specificities to these epitopes may converge across individuals.

The remaining five Jurkat-TCR+ cell lines (TCR 2–6) exhibited single specificity to the MADS tetramer with four of five coming from the peptide expansion. To evaluate possible interference between tetramers, which can arise when pHLA–TCR-binding affinities differ, Jurkat-TCR+ cell lines were stained with individual tetramers. The results confirm that four of these TCRs are indeed reactive only to MADS (Extended Data Fig. 5b). However, TCR 2, although showing strong binding preference to MADS, also bound the individual SNX8 tetramer, suggesting that the higher affinity for MADS may outcompete binding to the SNX8 tetramer in some cases. This observation is in line with the notion that autoreactive cross-reactive TCRs with lower relative affinities to autoantigens may escape thymic negative selection. Finally, because the original tetramer experiments were based on an early 2020 SARS-CoV-2 minor variant sequence (LQLPQGITL), all eight Jurkat-TCR+ cell lines were also stained with HLA tetramers loaded with the SARS-CoV-2 Wuhan MADS sequence (LQLPQGTTL). In all cases, the Jurkat-TCR+ cells bound the Wuhan MADS tetramer, consistent with the notion that T cells encoding these and other similar TCRs may be capable of responding to multiple SARS-CoV-2 strains (Extended Data Fig. 5c).

RNA expression profile of SNX8 during SARS-CoV-2 infection

As previously discussed, SNX8 is expressed across multiple tissues, but is highest in immune cells, consistent with its role in defending against RNA viruses via recruitment of MAVS27. To further investigate the potential impact of combined B cell and T cell autoimmunity to SNX8 following SARS-CoV-2 infection, we used scRNA-seq to analyse SNX8 expression in PBMCs from patients with severe, mild or asymptomatic SARS-CoV-2 infection or influenza infection and uninfected healthy controls48. Following SARS-CoV-2 infection, SNX8 had the highest mean expression in classical and non-classical monocytes and B cells (Extended Data Fig. 6a,b) and was elevated in individuals infected with SARS-CoV-2 compared with those who were uninfected (Extended Data Fig. 6c). Within myeloid lineage cells, SNX8 expression correlated with MAVS expression and OAS1 and OAS2 (which encode two known regulators of the MAVS pathway implicated in MIS-C pathogenesis7) expression (Extended Data Fig. 6d). Conversely, SNX8 expression is inversely correlated to SARS-CoV-2 infection severity. This follows a similar pattern to OAS1 and OAS2. However, unlike OAS1, OAS2 and MAVS, SNX8 is preferentially expressed during SARS-CoV-2 infection compared with influenza virus infection (Extended Data Fig. 6e).

Discussion

The SARS-CoV-2 pandemic largely spared children from severe disease. One rare but notable exception is MIS-C, an enigmatic and life-threatening syndrome. Previous studies have surfaced numerous associations, but have failed to identify a direct mechanistic link between SARS-CoV-2 and MIS-C. In this study, 199 samples from patients with MIS-C and 45 paediatric at-risk controls were analysed using customized human and SARS-CoV-2 proteome PhIP-seq libraries. Targeted follow-up experiments from these assays ultimately revealed that patients with MIS-C preferentially had antibodies targeting the epitope motif (ML)Q(ML)PQG shared by both the SARS-CoV-2 nucleocapsid protein and the human protein SNX8. Cross-reactive CD8+ T cells targeting both regions were detected in patients with MIS-C, but not in controls, suggesting that these CD8+ T cells may contribute to immune dysregulation through the inappropriate targeting of immune cells expressing SNX8. We found evidence that the (ML)Q(ML)PQG epitope motif elicits both B cell and T cell reactivity; further study of this epitope convergence is warranted.

These findings help to connect several important known aspects of MIS-C pathophysiology and draw parallels to other diseases in which exposure to a new antigen leads to autoimmunity, such as paraneoplastic autoimmune disease or cross-reactive epitopes between Epstein–Barr virus and host proteins in multiple sclerosis17,18,19,22,26. An expansion of T cells expressing TCRβ variable gene 11-2 (TRBV11-2) has been shown in MIS-C8,15,16; however, the underlying driver remains unknown. Although we did not observe an overrepresentation of TRBV11-2 in our putative cross-reactive TCR dataset, we did identify two expanded TRBV11-2+ clones (n = 6 and n = 2) sequenced directly from ex vivo samples. Although SNX8 is a relatively understudied protein, it has been linked to the function and activity of MAVS27. Dysregulation of the MAVS antiviral pathway, by inborn errors of immunity, has been shown to underlie certain cases of MIS-C7. The most straightforward connection linking MIS-C to SNX8 may be through an inappropriate autoimmune response against tissues with elevated MAVS pathway expression. These results are the first to directly link the initial SARS-CoV-2 infection and the subsequent development of MIS-C. We propose that MIS-C may be the result of multiple uncommon events converging. The initial insult is probably the formation of a combined B cell and T cell response that preferentially targets a particular motif within the MADS region of the SARS-CoV-2 nucleocapsid protein. In a subset of individuals, these B cell and T cell responses cross-react to the self-protein SNX8. This cross-reactive motif has strong binding characteristics for the MIS-C-associated HLA-A*02 (ref. 16), further indicating that this may be an important risk factor in the development of MIS-C.

Using conservative criteria (3 s.d. greater than controls by targeted immunoprecipitation of the epitope-containing peptide), at least 17% of sera from patients with MIS-C are autoreactive for SNX8; however, approximately 37% of sera from patients with MIS-C yielded detectable enrichment compared with controls in the entire dataset. Because we only tested for a single epitope target, we are unable to determine the upper limits of the in vivo frequency of cross-reactive CD8+ T cells in patients with MIS-C. Our results suggest that the frequency of these cross-reactive CD8+ T cells is within the range of 1 in 10,000–100,000 CD8+ T cells. This substantially exceeds the frequency of antigen-specific autoreactive CD8+ T cells found in peripheral circulation in bona fide T cell-mediated autoimmune diseases such as T1DM38 and multiple sclerosis44,45. Similar to T1DM, the autoreactive and cross-reactive CD8+ T cells in patients with MIS-C may be found at far greater abundance within peripheral tissues known to be affected by the disease38. Even accounting for these limitations, our results describe a subset of MIS-C, indicating that other mechanisms probably exist. Antibodies to ERFL are present in many children with MIS-C who do not have autoreactivity to SNX8, and ERFL has a highly similar tissue RNA expression profile as SNX8 (second-most similar among all known proteins; Human Protein Atlas)36. If autoreactive T cells to ERFL indeed exist, they would be predicted to engage a nearly identical set of cells and tissues. It is important to also consider that MIS-C prevalence has rapidly decreased as an increasing number of children have developed immunity through vaccination and natural SARS-CoV-2 infection. We speculate that perhaps this could be related to the strong deviation of the anti-SARS-CoV-2 immune response away from the critical MADS region of the nucleocapsid protein that we have identified, to other major epitopes such as those in the spike protein through vaccination and past infection49. Supporting this notion is recent CDC surveillance, which noted that more than 80% (92 of 112) of individuals with MIS-C in 2023 were in unvaccinated children (but vaccine eligible), and that the majority of children who developed MIS-C despite previous vaccination probably had waned immunity50.

MIS-C is complex, and more work will be required to fully understand this syndrome. The results of this study, and specifically the development of combined cross-reactive B cells and T cells, build on other notable examples of molecular mimicry; however, the mechanisms by which the presence of a cross-reactive epitope forces a break in tolerance remain unclear. Our results shed light on how one post-infectious disease (MIS-C) develops, yielding insights that may help better explain, diagnose and ultimately treat a range of additional conditions associated with infections.

Methods

Patients

Patients were recruited through the prospectively enrolling multicentre Overcoming COVID-19 and Taking on COVID-19 Together study in the USA. All patients meeting clinical criteria were included in the study, and therefore no statistical methods were used to predetermine sample size and no blinding or randomization of subjects occurred. The study was approved by the central Boston Children’s Hospital Institutional Review Board (IRB) and reviewed by IRBs of participating sites with CDC IRB reliance. A total of 292 patients consented and were enrolled into one of the following independent cohorts between 1 June 2020 and 9 September 2021: 223 patients hospitalized with MIS-C (199 in the primary discovery cohort and 24 in a separate subsequent validation cohort), 29 patients hospitalized for COVID-19 in either an intensive care or step-down unit (referred to as ‘severe acute COVID-19’ in this study) and 45 outpatients (referred to as ‘at-risk controls’ in this study) post-SARS-CoV-2 infections associated with mild or no symptoms. The demographic and clinical data are summarized in Extended Data Tables 13. The 2020 US CDC case definition was used to define MIS-C51. All patients with MIS-C had positive SARS-CoV-2 serology results and/or positive SARS-CoV-2 test results by reverse transcriptase quantitative PCR. All patients with severe COVID-19 or outpatient SARS-CoV-2 infections had a positive antigen test or nucleic acid amplification test for SARS-CoV-2. For outpatients, samples were collected from 36 to 190 days after the positive test (median of 70 days after a positive test; interquartile range of 56–81 days). For use as controls in the SARS-CoV-2-specific PhIP-seq, plasma from 48 healthy, pre-COVID-19 controls were obtained as deidentified samples from the New York Blood Center. These samples were part of retention tubes collected at the time of blood donations from volunteer donors who provided informed consent for their samples to be used for research.

DNA oligomers for SLBAs

DNA coding for the desired peptides for use in SLBAs were inserted into split luciferase constructs containing a terminal HiBiT tag and synthesized (Twist Biosciences) as DNA oligomers and verified by Twist Biosciences before shipment. Constructs were amplified by PCR using the 5′- AAGCAGAGCTCGTTTAGTGAACCGTCAGA-3′ and 5′-GGCCGGCCGTTTAAACGCTGATCTT-3′ primer pair.

For SNX8, the oligomers coded for the following sequences:

EADPPASDLPTPQAIEPQAIVQQVPAPSRMQMPQGNPLLLSHTLQELLA

AAAAAAAAAATPQAIEPQAIVQQVPAPSRMQMPQGNPLLLSHTLQELLA

EADPPAAAAAAAAAAEPQAIVQQVPAPSRMQMPQGNPLLLSHTLQELLA

EADPPASDLPAAAAAAAAAAVQQVPAPSRMQMPQGNPLLLSHTLQELLA

EADPPASDLPTPQAIAAAAAAAAAAAPSRMQMPQGNPLLLSHTLQELLA

EADPPASDLPTPQAIEPQAIAAAAAAAAAAQMPQGNPLLLSHTLQELLA

EADPPASDLPTPQAIEPQAIVQQVPAAAAAAAAAANPLLLSHTLQELLA

EADPPASDLPTPQAIEPQAIVQQVPAPSRMAAAAAAAAAASHTLQELLA

EADPPASDLPTPQAIEPQAIVQQVPAPSRMQMPQGAAAAAAAAAAELLA

EADPPASDLPTPQAIEPQAIVQQVPAPSRMQMPQGNPLLLAAAAAAAAA

For SARS-CoV-2 nucleocapsid protein, the oligomers coded for the following sequences:

ATEGALNTPKDHIGTRNPANNAAIVLQLPQGTTLPKGFYAEGSRGGSQA

AAAAAAAAAADHIGTRNPANNAAIVLQLPQGTTLPKGFYAEGSRGGSQA

ATEGAAAAAAAAAAARNPANNAAIVLQLPQGTTLPKGFYAEGSRGGSQA

ATEGALNTPKAAAAAAAAAANAAIVLQLPQGTTLPKGFYAEGSRGGSQA

ATEGALNTPKDHIGTAAAAAAAAAALQLPQGTTLPKGFYAEGSRGGSQA

ATEGALNTPKDHIGTRNPANAAAAAAAAAAGTTLPKGFYAEGSRGGSQA

ATEGALNTPKDHIGTRNPANNAAIVAAAAAAAAAAKGFYAEGSRGGSQA

ATEGALNTPKDHIGTRNPANNAAIVLQLPQAAAAAAAAAAEGSRGGSQA

ATEGALNTPKDHIGTRNPANNAAIVLQLPQGTTLPAAAAAAAAAAGSQA

ATEGALNTPKDHIGTRNPANNAAIVLQLPQGTTLPKGFYAAAAAAAAAA

DNA plasmids for RLBAs

For RLBAs, DNA expression plasmids under control of a T7 promoter and with a terminal Myc–DDK tag for the desired protein were utilized. For ERFL, a custom plasmid was ordered from Twist Bioscience in which a Myc–DDK-tagged full-length ERFL sequence under a T7 promoter was inserted into the pTwist Kan High Copy Vector (Twist Bioscience). Twist Bioscience verified a sequence-perfect clone by next-generation sequencing before shipment. Upon receipt, the plasmid was sequence verified by Primordium Labs. For SNX8, a plasmid containing the Myc–DDK-tagged full-length human SNX8 under a T7 promoter was ordered from Origene (RC205847) and was sequence verified by Primordium Labs upon receipt. For KDELR1, a plasmid containing the Myc–DDK-tagged full-length human KDELR1 under a T7 promoter was ordered from Origene (RC205880) and was sequence verified by Primordium Labs upon receipt. For IL1RN, a plasmid containing the Myc–DDK-tagged full-length human IL1RN under a T7 promoter was ordered from Origene (RC218518) and was sequence verified by Primordium Labs upon receipt.

Polypeptide pools for activation-induced marker assays

To obtain polypeptides tiling the full-length SNX8 protein, 15-mer polypeptide fragments with 11-amino acid overlaps were ordered from JPT Peptide Technologies and synthesized. Together, a pool of 130 of these polypeptides (referred to as the ‘SNX8 pool’) spanned all known translated SNX8 (the full-length 465-amino acid SNX8 protein, as well as a unique region of SNX8 isoform 3). A separate pool was designed to cover primarily the region of SNX8 with similarity to the SARS-CoV-2 nucleocapsid protein in high resolution (referred to as the ‘high-resolution epitope pool’). This pool contained 20 10-mers with 9-amino acid overlaps tiling amino acids 44–72 (IVQQVPAPSRMQMPQGNPLLLSHTLQELL) of the full-length SNX8 protein. The sequence of each of these 150 polypeptides was verified by mass spectrometry and purity was calculated by high-performance liquid chromatography (HPLC).

Peptides for tetramer assays

For use in loading tetramers, three peptides were ordered from Genemed Synthesis as 9-mers. LQLPQGTTL and LQLPQGITL correspond to the region of the SARS-CoV-2 nucleocapsid protein with similarity to human SNX8 in the ancestral sequence and a minor variant, respectively. This sequence was verified by mass spectrometry and purity was calculated as 96.61% by HPLC. The other sequence, MQMPQGNPL, corresponds to the region of human SNX8 protein with similarity to the SARS-CoV-2 nucleocapsid protein. This sequence was verified by mass spectrometry and purity was calculated as 95.83% by HPLC.

Human proteome PhIP-seq

Human proteome PhIP-seq was performed following our previously published vacuum-based PhIP-seq protocol12 (https://www.protocols.io/view/scaled-high-throughput-vacuum-phip-protocol-ewov1459kvr2/v1).

Our human peptidome library consists of a custom-designed phage library of 731,724 unique T7 bacteriophage each presenting a different 49-amino acid peptide on its surface. Collectively, these peptides tile the entire human proteome including all known isoforms (as of 2016) with 25-amino acid overlaps. Of the phage library, 1 ml was incubated with 1 μl of human serum overnight at 4 °C and immunoprecipitated with 25 μl of 1:1 mixed protein A and protein G magnetic beads (10008D and 10009D, Thermo Fisher). These beads were than washed, and the remaining phage–antibody complexes were eluted in 1 ml of Escherichia coli (BLT5403, EMD Millipore) at 0.5–0.7 OD and amplified by growing in a 37 °C incubator. This new phage library was then re-incubated with the serum from the same individual and the previously described protocol was repeated. DNA was then extracted from the final phage library, barcoded, PCR amplified and Illumina adaptors were added. Next-generation sequencing was performed using an Illumina sequencer (Illumina) to a read depth of approximately 1 million per sample.

Human proteome PhIP-seq analysis

All human peptidome analysis (except when specifically stated otherwise) was performed at the gene level, in which all reads for all peptides mapping to the same gene were summed, and 0.5 reads were added to each gene to allow inclusion of genes with zero reads in mathematical analyses. Within each individual sample, reads were normalized by converting to the percentage of total reads. To normalize each sample against background nonspecific binding, a fold change over mock-IP was calculated by dividing the sample read percentage for each gene by the mean read percentage of the same gene for the AG bead-only controls. This fold-change signal was then used for side-by-side comparison between samples and cohorts. Fold-change values were also used to calculate z scores for each patient with MIS-C compared with controls and for each control sample by using all remaining controls. These z scores were used for the logistic-regression feature weighting. In instances of peptide-level analysis, raw reads were normalized by calculating the number of reads per 100,000 reads.

SARS-CoV-2 proteome PhIP-seq

SARS-CoV-2 proteome PhIP-seq was performed as previously described39. In brief, 38 amino acid fragments tiling all open reading frames from SARS-CoV-2, SARS-CoV-1 and 7 other CoVs were expressed on T7 bacteriophage with 19-amino acid overlaps. Of the phage library, 1 ml was incubated with 1 μl of human serum overnight at 4 °C and immunoprecipitated with 25 μl of 1:1 mixed protein A and protein G magnetic beads (10008D and 10009D, Thermo Fisher). Beads were washed five times on a magnetic plate using a P1000 multichannel pipette. The remaining phage–antibody complexes were eluted in 1 ml of E. coli (BLT5403, EMD Millipore) at 0.5–0.7 OD and amplified by growing in 37 °C incubator. This new phage library was then re-incubated with the serum of the same individual and the previously described protocol was repeated for a total of three rounds of immunoprecipitations. DNA was then extracted from the final phage library, barcoded, PCR amplified and Illumina adaptors were added. Next-generation sequencing was then performed using an Illumina sequencer (Illumina) to a read depth of approximately 1 million per sample.

Coronavirus proteome PhIP-seq analysis

To account for differing read depths between samples, the total number of reads for each peptide fragment was converted to the number of reads per 100,000 (RPK). To calculate normalized enrichment relative to pre-COVID-19 controls (FC > pre-COVID-19), the RPK for each peptide fragment within each sample was divided by the mean RPK of each peptide fragment among all pre-COVID-19 controls. These FC > pre-COVID-19 values were used for all subsequent analyses as described in the text and figures.

RLBA

RLBAs were performed as previously described12,32. In brief, DNA plasmids containing full-length cDNA under the control of a T7 promoter for each of the validated antigens (see ‘DNA plasmids for RLBAs’ above) were verified by Primordium Labs sequencing. The respective DNA templates were used in the T7 TNT in vitro transcription/translation kit (L1170, Promega) using [35S]-methionine (NEG709A, PerkinElmer). Respective protein was column purified on Nap-5 columns (17-0853-01, GE Healthcare), and equal amounts of protein (approximately 35,000 counts per minute) were incubated overnight at 4 °C with 2.5 μl of serum or 1 μl of anti-Myc-positive control antibody (1:10 dilution; 2272S, Cell Signaling Technology). Immunoprecipitation was then performed on 25 μl of Sephadex protein A/G beads (4:1 ratio; GE17-5280-02 and GE17-0618-05, Sigma-Aldrich) in 96-well polyvinylidene difluoride filtration plates (EK-680860, Corning). After thoroughly washing, the counts per minute of immunoprecipitated protein was quantified using a 96-well Microbeta Trilux liquid scintillation plate reader (Perkin Elmer).

SLBA

SLBA was performed as previously described52. A detailed SLBA protocol is available on protocols.io (https://doi.org/10.17504/protocols.io.4r3l27b9pg1y/v1).

In brief, the DNA oligomers listed above (see ‘DNA oligomers for SLBAs’) were amplified by PCR using the primer pairs listed above (see ‘DNA oligomers for SLBAs’). Unpurified PCR product was used as input in the T7 TNT in vitro transcription/translation kit (L1170, Promega) and the Nano-Glo HiBit Lytic Detection System (N3040, Promega) was used to measure relative luciferase units of translated peptides in a luminometre. Equal amounts of protein (in the range of 2 × 106–2 × 107 relative luciferase units) were incubated overnight with 2.5 μl patient sera or 1 μl anti-HiBit-positive control antibody (1:10 dilution; CS2006A01, Promega) at 4 °C. Immunoprecipitation was then performed on 25 µl of Sephadex protein A/G beads (1:1 ratio; GE17-5280-02 and GE17-0618-05, Sigma-Aldrich) in 96-well polyvinylidene difluoride filtration plates (EK-680860, Corning). After thoroughly washing, luminescence was measured using the Nano-Glo HiBit Lytic Detection System (N3040, Promega) in a luminometre.

Activation-induced marker assay

PBMCs were obtained from ten patients with MIS-C and ten controls for use in the activation-induced marker assay. PBMCs were thawed, washed, resuspended in serum-free RPMI medium and plated at a concentration of 1 × 106 cells per well in a 96-well round-bottom plate. For each individual, PBMCs were stimulated for 24 h with either the SNX8 pool (see above) at a final concentration of 1 mg ml−1 per peptide in 0.2% DMSO or a vehicle control containing 0.2% DMSO only. For four of the controls and two of the patients with MIS-C, there were sufficient PBMCs for an additional stimulation condition using the SNX8 high-resolution epitope pool (see above) also at a concentration of 1 mg ml−1 per peptide in 0.2% DMSO for 24 h. Following the stimulation, cells were washed with FACS buffer (Dulbecco’s PBS without calcium or magnesium, 0.1% sodium azide, 2 mM EDTA and 1% FBS) and stained with the following antibody panel each at 1:100 dilution for 20 min at 4 °C, and then flow cytometry analysis was immediately performed.

For the antibody panel: anti-CD3 Alexa 647 (clone OKT3, 317312, BioLegend), anti-CD4 Alexa 488 (clone OKT4, 317420, BioLegend), anti-CD8 Alexa 700 (clone SK1, 344724, BioLegend), anti-OX-40 (also known as CD134) PE-Dazzle 594 (clone ACT35, 350020, BioLegend), anti-CD69 PE (clone FN-50, 310906, BioLegend), anti-CD137 (also known as 4-1BB) BV421 (clone 4B4-1, 309820, BioLegend), anti-CD14 PerCP-Cy5 (clone HCD14, 325622, BioLegend), anti-CD16 PerCP-Cy5 (clone B73.1, 360712, BioLegend), anti-CD19 PerCP-Cy5 (clone HIB19, 302230, BioLegend) and Live/Dead Dye eFluor 506 (65-0866-14, Invitrogen).

The activation-induced marker analysis was performed using FlowJo software using the gating strategy shown in Extended Data Fig. 7a. All gates were fixed within each condition of each sample. Activated CD4 T cells were defined as those that were co-positive for OX40 and CD137. Activated CD8 T cells were defined as those that were co-positive for CD69 and CD137. Gating thresholds for activation were defined by the outer limits of signal in the vehicle controls allowing for up to two outlier cells. Frequencies were calculated as a percentage of total CD3+ cells (T cells). Two MIS-C samples had insufficient total events captured by flow cytometry (total of 5,099 and 4,919 events, respectively) and were therefore removed from analysis.

Initial tetramer assay

For the initial tetramer assay, see Extended Data Fig. 4a. PBMCs from two patients with MIS-C with HLA-A*02:01 (HLA typed from PAXgene RNAseq, one confirmed by serotyping), one patient with MIS-C with HLA-B*35:01 (HLA typed from PAXgene RNAseq) and three at-risk controls with HLA-A*02.01 (all three identified by serotyping, two of three confirmed by PAXgene RNAseq HLA typing; the other sample did not have genomic DNA available for genotyping) were thawed, washed and put into culture with media containing recombinant human IL-2 at 10 ng ml−1 in 96-well plates. The peptide fragments (details above) LQLPQGITL and MQMPQGNPL were then added to PBMCs to a final concentration of 10 mg ml−1 per peptide and incubated (37 °C at 5% CO2) for 7 days.

Following the 7 days of incubation, a total of eight pHLA class I tetramers were generated from UV-photolabile biotinylated monomers, four each from HLA-A*02:01 and HLA-B*35:01 (NIH Tetramer Core). Peptides were loaded via UV peptide exchange. Tetramerization was carried out using streptavidin conjugated to fluorophores PE and APC or BV421 followed by quenching with 500 µM d-biotin, similar to our previously published methods44,53. Tetramers were then pooled together as shown below:

For the HLA-A*02:01 pool, the MADS (LQLPQGITL)-loaded PE tetramer, MADS (LQLPQGITL)-loaded APC tetramer, SNX8 (MQMPQGNPL)-loaded PE tetramer and SNX8 (MQMPQGNPL)-loaded BV421 tetramer were used, all with HLA-A*02:01 restriction.

For the HLA-B*35:01 pool, the MADS (LQLPQGITL)-loaded PE tetramer, MADS (LQLPQGITL)-loaded APC tetramer, SNX8 (MQMPQGNPL)-loaded PE tetramer and SNX8 (MQMPQGNPL)-loaded BV421 tetramer were used, all with HLA-B*35:01 restriction.

All PBMCs were then treated with 100 nM dasatinib (StemCell) for 30 min at 37 °C followed by staining (no wash step) with the respective tetramer pool corresponding to their HLA restriction (final concentration of 2–3 µg ml−1) for 30 min at 25 °C. Cells were then stained with the following cell-surface markers each at 1:100 dilution for 20 min, followed by immediate analysis on a flow cytometer.

For the surface markers: anti-CD8 Alexa 700 (clone SK1, 357404, BioLegend), anti-CD4 PerCP-Cy5 (clone SK1, 300530, BioLegend), anti-CD14 PerCP-Cy5 (clone HCD14, 325622, BioLegend), anti-CD16 PerCP-Cy5 (clone B73.1, 360712, BioLegend), anti-CD19 PerCP-Cy5 (clone HIB19, 302230, BioLegend) and Live/Dead Dye eFluor 506 (65-0866-14, Invitrogen). Streptavidin was conjugated to PE (S866, Invitrogen), APC (S868, Invitrogen) and BV421 (405225, BioLegend).

The gating strategy is outlined in Extended Data Fig. 7b. A stringent tetramer gating strategy was used to identify cross-reactive T cells, in which CD8+ T cells were required to be triple positive for PE, APC and BV421 labels (that is, a single CD8 T cell bound to PE-conjugated LQLPQGITL and/or PE-conjugated MQMPQGNPL in addition to APC-conjugated LQLPQGITL and BV421-conjugated MQMPQGNPL).

Serotyping was performed using an anti-HLA-A2 antibody (1:100 dilution; FITC anti-human HLA-A2 antibody, clone BB7.2, 343303, BioLegend), and pertinent results are shown in Extended Data Fig. 7c.

Assembly of easYmer monomers and fold testing

For the assembly of HLA class I pHLA easYmer monomers and fold testing, see Fig. 4. Unfolded, biotinylated easYmer monomers (Immudex) were obtained for HLA-A*02:01 and HLA-A*02:06. SARS-CoV-2 MADS (LQLPQGITL), SARS-CoV-2 Wuhan (LQLPQGTTL) and human SNX8 (MQMPQGNPL) peptides were commercially synthesized (Genscript), diluted to 1 mM in ddH2O or DMSO, and loaded onto each easYmer allele according to the manufacturer’s instructions at 18 °C for 48 h. Proper pHLA monomer formation and MADS and SNX8 peptide-binding strength were evaluated for each HLA using a ‘β2m fold test’ relative to negative (no peptide; unloaded monomer) and positive (strong binding peptide; CMV pp65 495–503 (NLVPMVATV)) controls as per the manufacturer’s protocol. In brief, peptide-loaded monomers with a concentration of 500 nM were serially diluted to 9 nM, 3 nM and 1 nM in dilution buffer (1× PBS with 5% glycerol; G5516, Sigma-Aldrich) and incubated with streptavidin beads (6–8 μm; SVP-60-5, Spherotech) at 37 °C for 1 h to allow binding of stable complexes to beads, then washed three times with FACS buffer (1× PBS, 0.5% BSA (A7030, Sigma-Aldrich) and 2 mM EDTA (15575-038, Thermo Fisher Scientific)). Samples were then stained with PE-conjugated anti-human β2m antibody (clone BBM.1, sc-13565, Santa Cruz Biotech) at 1:200 for 30 min at 4 °C, washed three times with FACS buffer and analysed on a 5 Laser 16UV-16V-14B-10YG-8R AURORA spectral cytometer (Cytek). pHLA-binding strength positively correlated with stability and concentration of the pHLA–β2m complex. Therefore, the geometric mean fluorescence intensity of anti-β2m staining in this assay reports on the strength of the pHLA binding compared with the positive and negative controls. We classified binding strength for each HLA and peptide combination based on the fold change in anti-β2m geometric mean fluorescence intensity over the no-peptide negative control at 9 nM. Strong binders were defined at more than 10-fold higher, moderate binders at more than 3-fold, weak binders at more than 1.5-fold and non-binders at less than 1.5-fold change over the negative control. Flow cytometry data were analysed using FlowJo version 10.7.2 software (BD Biosciences).

pHLA tetramer assembly

For the pHLA tetramer assembly, see Fig. 4. pHLA tetramers were assembled from HLA-A*02:01 and HLA-A*02:06 easYmer monomers (Immudex) with confirmed peptide binding to SARS-CoV-2 MADS (LQLPQGITL), Wuhan (LQLPQGTTL) and SNX8 (MQMPQGNPL) peptides according to the manufacturer’s instructions. In brief, fluorochrome-conjugated streptavidin (0.2 mg ml−1, PE, 405203, BioLegend; 0.2 mg ml−1, APC, 405207, BioLegend; and BV421, 405226, BioLegend) was added to loaded monomers at 8 ng per 1 μl pHLA complex (500 nM) in three volumes. After each 1/3 volume addition, samples were mixed and incubated for 15 min at 4 °C in the dark. Assembled tetramers were stored at 4 °C in the dark until use.

Enhanced peptide-specific T cell expansion

For enhanced peptide-specific T cell expansion, see Fig. 4. PBMCs from MIS-C confirmed participants with HLA-A*02:01 or HLA-A*02:06 were obtained for peptide-specific expansion according to published methods54 before single-cell sorting of tetramer-positive T cells. On expansion day 0, PBMCs were thawed, counted and seeded onto 96-well round-bottom plates at 100,000 cells per well in 200 μl antigen-presenting cell differentiation media (X-VIVO 15 serum-free haematopoietic cell medium (04-418Q, Lonza) supplemented with human GM-CSF (1,000 IU ml−1; 130-095-372, Miltenyi Biotec), human IL-4 (500 IU ml−1; 204-IL-010, R&D Systems) and human Flt3-L (50 ng ml−1; 308-FKN-025, R&D Systems) final concentrations) and incubated for 24 h at 37 °C and 5% CO2. On day 1, 100 μl cell supernatant was replaced with 100 μl Adjuvant Solution (X-VIVO 15 supplemented with R848 (10 μM; tlrl-r848-5, InvivoGen), lipopolysaccharide (Salmonella minnesota; 100 ng ml−1; tlrl-smlps, InvivoGen) and human IL-1β (10 ng ml−1; 201-LB-010, R&D Systems) final concentrations) and pooled MADS (LQLPQGITL) and SNX8 (MQMPQGNPL) peptides at a final concentration of 10 μM each. No-peptide control wells were set up for each sample by adding a 1:2 dilution of DMSO in H2O to match the peptide volume and diluent. Cells were incubated for 24 h at 37 °C and 5% CO2. On days 2, 4, 7 and 9, 100 μl supernatant was replaced with 100 μl T cell expansion solution: RP-10 (RPMI 1640 (22400-089, Gibco), 10% heat-inactivated human serum AB (100-512, Gemini Bio-Products), 10 mM HEPES, 0.1 mg ml−1 gentamicin (15750-060, Thermo Fisher Scientific) and 1× GlutaMAX (35050-061, Gibco)) supplemented with human IL-2 (10 IU ml−1; 202-IL-050, R&D Systems), human IL-7 (10 ng ml−1; 207-IL-025, R&D Systems) and human IL-15 (10 ng ml−1; 200-15, PeproTech) final concentrations. On day 10, peptide-expanded cells from an individual participant were pooled; cells from no-peptide controls were collected separately.

Single-cell index sorting

Unexpanded PBMCs (direct ex vivo) or peptide-expanded T cells were obtained, washed in 1× PBS and treated with 100 nM dasatinib (CDS023389, Sigma-Aldrich) in 1× PBS for 30 min at 37 °C and 5% CO2 (ref. 55). Cells were then pelleted and resuspended in 50 μl FACS buffer (1× PBS and 0.04% BSA) supplemented with human TruStain FcX blocking buffer (1:10 dilution; 422302, BioLegend), 500 μM d-biotin (B20656, Thermo Fisher Scientific) and a unique tetramer cocktail containing MADS–tetramer–PE (1:10 dilution), MADS–tetramer–APC (1:10 dilution), SNX8–tetramer–PE (1:10 dilution) and SNX8–tetramer–BV421 (1:10 dilution) based on participant HLA type (A*02:01 and A*02:06). Cells were incubated in the dark at 25 °C for 1 h followed by direct addition of 50 μl (100 μl total volume) of FACS supplemented with 500 μM d-biotin and an antibody cocktail containing FITC-conjugated anti-human CD3 (1:20 dilution; clone OKT3, lot B390808, 317306, BioLegend), BV605-conjugated anti-human CD8 (1:20 dilution; clone SK1, lot B371925, 344742, BioLegend), BV510-conjugated anti-human CD4 (1:20 dilution; clone OKT4, lot B375526, 317444, BioLegend), BV510-conjugated anti-human CD14 (1:20 dilution; clone 63D3, lot B390770, 367124, BioLegend), BV510-conjugated anti-human CD16 (1:20 dilution; clone 3G8, lot B372132, 302048, BioLegend), BV510-conjugated anti-human CD19 (1:20 dilution; clone HIB19, lot B390665, 302242, BioLegend) and Ghost Dye Violet 510 Viability Dye (1:400 dilution; lot D0870061322133, 13-0870-T500, Tonbo Biosciences) for 30 min in the dark at 4 °C. Cells were then pelleted, washed twice with 4 ml FACS buffer (containing 500 μM d-biotin), suspended in 500 μl FACS (containing 500 μM d-biotin) and passed through a 45-μM filter before proceeding to single-cell sorting on a Sony SY3200 cell sorter. Individual, live, BV510 dump gate (CD4, CD14, CD16 and CD19)-negative, CD3+CD8+ T lymphocytes were gated to distinguish tetramer triple-positive cells (PE+APC+BV421+) as described in Extended Data Fig. 7d and sorted into individual wells of a 384-well plate loaded with Superscript VILO master mix (11754250, Thermo Fisher Scientific). After sorting, plates were centrifuged at 500g and stored at −80 °C until processing.

Paired TCRαβ amplification and sequencing

Single-cell paired TCRα and TCRβ chain library preparation and sequencing was performed on T cells sorted into 384-well index plates as previously described56. In brief, after reverse transcription of cells sorted in Superscript VILO master mix, cDNA underwent two rounds of nested multiplex PCR amplification using a mix of human V-segment-specific forward primers and human TRAC and TRBC segment-specific reverse primers (see Supplementary Table 1 for primer details). Resulting TCRα and TCRβ amplicons were sequenced on an Illumina MiSeq at 2 × 150-bp read length.

Cell lines

All cultured cell lines were maintained at 37 °C and 5% CO2 in a humidified incubator. HEK 293T cells (CRL-3216, American Type Culture Collection) were purchased from the American Type Culture Collection and verified commercially. HEK 293T cells were cultured in DMEM (11965-092, Gibco) supplemented with 10% FBS (16140-071, Gibco), 2 mM l-glutamine (25030-081, Gibco) and 100 U ml−1 penicillin–streptomycin (15140-122, Gibco). 2D3 Jurkat J76.7 cells57,58 (TCR-null, CD8+) expressing an NFAT–eGFP reporter were kindly provided by F. Fujiki and were cultured in RPMI 1640 (22400-089, Gibco) supplemented with 10% FBS, 2 mM l-glutamine and 100 U ml−1 penicillin–streptomycin. All cell lines were confirmed to be mycoplasma negative during the course of experiments.

TCR repertoire analysis

TCR similarity networks were constructed as previously described49,59. In brief, to measure the distance between TCRαβ clonotypes, we used the TCRdist algorithm implementation from the CoNGA v0.1.2 Python package47. Further analysis was performed using the R language for statistical computing, with merging and subsetting of data performed using the dplyr v1.1.4 package. TCR similarity networks were built using stringdist v0.9.12 and igraph v2.0.3 (ref. 60) R packages, and visualized using gephi v0.9.7 (ref. 61) software.

TCR reconstruction and cloning

Full-length TCRαβ sequences were reconstructed from V/J gene usage and CDR3 sequences using Stitchr v1.0.0 (ref. 62) for each index-sorted T cell. TCRα and TCRβ chain sequences were modified to use murine constant regions and joined by a 2A element from thosea asigna virus (T2A). A sequence encoding mCherry was additionally appended by a 2A element from porcine teschovirus (P2A) as a fluorescent marker of transduction. The full-length gene fragment encoding TCRβ–T2A–TCRα–P2A–mCherry was synthesized and cloned commercially (Genscript) into the lentiviral vector pLVX-EF1α-IRES-Puro (631253, Takara).

Generation of TCR-expressing Jurkat cells

To generate transducing particles packaging individual TCRs of interest (Fig. 4c), HEK 293T cells were transduced with a pLVX lentiviral vector encoding a unique TCRαβ–mCherry insert, psPAX2 packaging plasmid (plasmid #12260, Addgene) and an pMD2.G envelope plasmid (plasmid #12259, Addgene) at a ratio of 4:3:1. At 24 h and 48 h post-transfection, viral supernatants were harvested, passed through a 0.45-µm SFCA filter (723-9945, Thermo Fisher Scientific), concentrated using Lenti-X Concentrator (631232, Takara) and stored at −80 °C as single-use aliquots. To generate TCR-expressing Jurkat cell lines (Jurkat-TCR+), 2D3 Jurkat J76.7 cells (TCR-null, CD8+, NFAT–eGFP reporter) were seeded in a 12-well tissue-culture-treated plate at 1 × 106 cells per well in complete RPMI (RPMI 1640, 10% FBS, 2 mM l-glutamine, 100 U ml−1 penicillin–streptomycin) and transduced by adding concentrated lentivirus dropwise to each well. At 48–72 h post-tranduction, puromycin was added at 1 μg ml−1 and cultured for 1 week to select for transduced cells. Jurkat-TCR+ cell lines were validated for the presence of correctly folded TCR on the cell surface by flow cytometry using a monoclonal antibody targeting the mouse TCRβ constant region (APC/Fire750-conjugated; clone H57-597, 109246, BioLegend; Extended Data Fig. 5a). Flow cytometry data were collected on a custom-configured BD Fortessa using FACSDiva software (v8.0.1; Becton Dickinson) and analysed using FlowJo version 10.7.2 software (BD Biosciences).

Specificity validation of putative cross-reactive TCR sequences

The specificity of TCR-expressing Jurkat T cell lines was validated by tetramer staining using the same reagents used for single-cell sorting PBMCs (above). In brief, 1 × 106 Jurkat-TCR+ cell lines or untransduced Jurkat J76.7 (TCR-null; background control) were washed in 1× PBS and resuspended in 50 μl FACS buffer (1× PBS and 0.04% BSA) and a unique tetramer cocktail containing MADS–tetramer–PE (1:10 dilution), MADS–tetramer–APC (1:10 dilution), SNX8–tetramer–PE (1:10 dilution) and SNX8–tetramer–BV421 (1:10 dilution) based on the restricting HLA type (A*02:01 and A*02:06). Tetramers conjugated to the Wuhan peptide sequence (LQLPQGTTL), including Wuhan–tetramer–PE (1:10 dilution) and Wuhan–tetramer–APC (1:10 dilution), were also tested. A second set of wells were set up in which each individual tetramer was used to stain cells. Cells were incubated in the dark at 25 °C for 30 min after which 50 µl of FACS buffer containing Ghost Dye Violet 510 Viability Dye (1:400 dilution; lot D0870061322133, 13-0870-T500, Tonbo Biosciences) was added for an additional 30-min incubation in the dark at 25 °C. Cells were then washed twice with 1 ml FACS buffer, suspended in 300 μl FACS and analysed by flow cytometry on a custom-configured BD Fortessa using FACSDiva software (v8.0.1; Becton Dickinson). Cell population gating and fluorescence analysis was performed using FlowJo version 10.7.2 software (BD Biosciences) as described in Extended Data Fig. 7e.

scRNA-seq analysis

To assess the cell-type specificity in a relevant disease context, we analysed SNX8 expression from a single-cell sequencing of PBMC samples from patients with severe, mild or asymptomatic COVID-19 infection, influenza virus infection and healthy controls48. Gene expression data from 59,572 pre-filtered cells were downloaded from the Gene Expression Omnibus database under accession GSE149689 for analysis and downstream processing with scanpy v1.10.0 (ref. 63). Cells with (1) less than 1,000 total counts, (2) less than 800 expressed genes, and (3) more than 3,000 expressed genes were filtered out as further quality control, leaving 42,904 cells for downstream analysis. Gene expression data were normalized to have 10,000 counts per cell and were log1p transformed. Highly variable genes were calculated using the scanpy function highly_variable_genes using Seurat flavor with the default parameters (min_mean = 0.0125, max_mean = 3, and min_disp = 0.5)64. Only highly variable genes were used for further analysis. The total number of counts per cell was regressed out, and the gene expression matrix was scaled using the scanpy function scale with max_value = 10. Dimensionality reduction was performed using principal components analysis with 50 principal components. Batch balanced k-nearest neighbours, implemented with scanpy’s function bbknn, was used to compute the top neighbours and normalize batch effects65. The batch-corrected cells were clustered using the Leiden algorithm and projected into two dimensions with uniform manifold approximation and projection for visualization. Initial cluster identity was determined by finding marker genes with differential expression analysis performed using a Student’s t-test on log1p-transformed raw counts with the scanpy function rank_genes_groups66,67.

Statistical methods

All statistical analysis was performed in Python using the Scipy Stats package unless otherwise indicated. For comparisons of distributions of PhIP-seq enrichment between two groups, a non-parametric Kolmogorov–Smirnov test was utilized. For logistic-regression feature weighting, the Scikit-learn package68 was used, and logistic-regression classifiers were applied to z-scored PhIP-seq values from individuals with MIS-C versus at-risk controls. A liblinear solver was used with L1 regularization, and the model was evaluated using a five-fold cross-validation (four of the five for training, and one of the five for testing). For the RLBAs and SLBAs, first an antibody index was calculated as follows: (sample value – mean blank value)/(positive control antibody values – mean blank values). For the alanine mutagenesis scans, blank values of each construct were combined, and a single mean was calculated. A normalization function was then applied to the experimental samples only (excluding antibody-only controls) to create a normalized antibody index ranging from 0 to 1. Comparisons between two groups of samples were performed using a Mann–Whitney U-test. An antibody was considered to be ‘positive’ when the normalized antibody index in a sample was greater than 3 s.d. above the mean of controls. When comparing two groups of normally distributed data, a Student’s t-test was performed.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.