Main

High-throughput single-cell technologies such as flow cytometry and mass cytometry, which can measure features on millions of individual cells, and high-dimensional single-cell technologies such as single-cell RNA sequencing (scRNA-seq), which can measure potentially thousands of features in individual cells, are well suited to support studies of the heterogeneity of immune responses and of how immune cells interact with other host cells and with pathogens. Specific applications of single-cell technologies in the field of immunology include identifying host immunological correlates of disease severity (potentially aiding the design of effective vaccines and therapeutics as well as the monitoring of each person’s response to these approaches), elucidating molecular mechanisms of disease and enabling the identification of predictive biomarkers of disease outcome.

The ongoing COVID-19 pandemic has been described as ‘an explosive pandemic of historic proportions’ (ref. 1), with over 250 million confirmed cases and over 5 million confirmed deaths worldwide2. In addition to basic measures such as physical distancing and mask wearing, optimal management of the pandemic may involve a diverse armamentarium of scientific tools including effective and safe preventative vaccines3, early therapeutic interventions that can blunt progression to severe disease4,5 and anti-inflammatory treatments to counteract the harmful ‘cytokine storm’ in patients with severe disease6,7,8. Toward the development of these tools, there is an urgent need to understand SARS-CoV-2 interactions with host cells and the host immune response9.

Here we provide an overview of the single-cell technologies that have been applied to COVID-19 studies and list the pertinent experimental aspects of each study (including sample size, technology platform and patient characteristics). We describe our efforts to organize and curate available single-cell sequencing datasets into an easily downloadable format, providing information on how to access these datasets. We also review key insights obtained from single-cell immune profiling and discuss opportunities and challenges of integrative analysis of publicly available datasets.

Single-cell technologies and available datasets

Single-cell technologies that have been used in COVID-19 studies to date are summarized in Table 1. The summary draws on 62 published articles and two preprints describing studies that applied one or more single-cell technologies in the context of COVID-19. Figure 1 displays the sample size and dimensionality of the studies; Supplementary Table 1 presents relevant experimental details and locations of publicly available datasets (raw FCS files are publicly available for six flow cytometry and three mass cytometry datasets, and 24 single-cell sequencing datasets are publicly available). Although flow cytometry is the most widely used technology in these studies, scRNA-seq and single-cell multi-omic profiling are increasingly being used. Flow or mass cytometry studies analyzed data from up to ~300 individuals and up to 62 markers with one or multiple panels; 25 of the 54 datasets include longitudinal data (Fig. 1a). Most single-cell sequencing studies analyzed >50,000 cells from fewer than 150 individuals; only a few of them included longitudinal data (Fig. 1b).

Table 1 Summary of single-cell technologies that have been applied to study COVID-19
Fig. 1: Visual representation of characteristics of the 64 published articles or publicly posted preprints on COVID-19 that have used one or more single-cell technologies (March 2020–March 2021).

a,b, Scatterplots showing the number of participants versus the number of flow cytometry markers (a) or the number of cells for which data are available in each study (b). Each symbol represents a dataset using one of the single-cell technologies from a single study. Opacity indicates dataset availability to the public (light, no; dark, yes); shape indicates whether the dataset has longitudinal data (circle, no; triangle, yes); and color indicates assay type (red, flow cytometry; cyan, mass cytometry (a); red, repertoires; gold, RNA; green, RNA and protein; blue, RNA, protein and repertoires; magenta, RNA and repertoires (b)).

In the following text, we briefly summarize key conclusions from the 64 studies shown in Fig. 1, focusing on the insights obtained via single-cell technologies, including studies still at the preprint stage or that have relatively small sample sizes. As most studies were performed in peripheral blood mononuclear cells (PBMCs), only limited conclusions can be drawn about the respiratory tract, the primary site of infection. Most studies focused on transcriptional (as opposed to protein or epigenetic) readouts. In the sections below, we summarize these findings in the context of innate immune cells, B cells and T cells, finally summarizing how these single-cell data may correlate with immune protection.

Innate immune responses

Most flow and mass cytometry-based studies that analyzed PBMCs from patients with COVID-19 report reduced frequencies or abundances of circulating basophils10,11,12, monocytes13,14,15 (especially CD14loCD16hi non-classical monocytes16), dendritic cells (DCs)10,14,15,17 and natural killer (NK) cells14,15,17,18,19 when compared with those from healthy donors, with greater reductions in individuals with severe COVID-19 than in those with mild COVID-19 (refs. 15,16,17,18,19). Conversely, patients with COVID-19 have shown increased frequencies or abundances of circulating neutrophils, eosinophils and monocytic myeloid-derived suppressor cells compared with healthy donors, with greater increases in individuals with severe COVID-19 than in those with mild COVID-19 (refs. 11,14,16,17,19). The neutrophil-to-lymphocyte ratio was also reported to be associated with severe COVID-19 (ref. 17).

NK cells

High-dimensional flow cytometry has enabled in-depth characterization of immune cell subsets. A report featuring a 28-color NK-cell-oriented panel described fewer circulating (yet more highly activated and proliferating) NK cells in patients with COVID-19 than in healthy controls18. Worse clinical outcomes were also correlated with increased levels of circulating adaptive NK cells (NKG2C+CD57+CD56dim) and higher levels of perforin expression in CD56bright NK cells18, implicating adaptive and activated NK cells in COVID-19 pathogenesis.

The above study is of particular note because it incorporated an analysis of publicly available scRNA-seq data (NK cells in bronchoalveolar lavage fluid (BALF) from patients with COVID-19 (ref. 20)). These data reveal high activation of NK cells in COVID-19 and corroborate the flow cytometry results18.

Neutrophils

Neutrophil extracellular traps have been implicated in severe COVID-19 (refs. 21,22). A ‘developing neutrophil’ subpopulation, specifically increased in patients with acute respiratory distress syndrome, was identified through scRNA-seq and featured expression of neutrophil granule-related genes and a lack of expression of canonical neutrophil markers15.

Another study featuring scRNA-seq, high-dimensional flow cytometry and mass cytometry reported that severe COVID-19 was associated with a substantial increase in circulating immature neutrophils and the presence of a neutrophil cluster characterized by upregulated S100A8 and S100A9 (calprotectin) among other genes16.

DCs

One study23 is notable for its relatively large sample size (n = 52 patients with COVID-19 and n = 62 healthy controls) and its use of a phospho-specific cytometry by time of flight (CyTOF) panel to immunophenotype PBMCs along with cellular indexing of transcriptomes and epitopes by sequencing (CITE-seq) to profile gene and protein expression in DC-enriched PBMC samples from patients with COVID-19. One major observation from this study is that the frequency of plasmacytoid DCs (pDCs) was decreased in patients with COVID-19 and that expression of mammalian target of rapamycin (mTOR) signaling proteins was reduced in these pDCs, suggesting that pDCs may have deficient interferon (IFN)-α signaling in patients with COVID-19.

Monocytes

In another study also distinguished by a relatively large sample size (n = 53 patients with COVID-19, n = 8 patients with flu-like illness and n = 48 controls over two independent cohorts), researchers used single-cell transcriptomic (scRNA-seq) and proteomic (CyTOF) interrogation24 to test whether distinct innate immune responses are associated with the clinical course of COVID-19. The study revealed that, compared with healthy controls, patients with mild COVID-19 had increased levels of inflammatory HLA-DRhiCD11chiCD14+ monocytes. Conversely, patients with severe COVID-19 were distinguished not only by monocyte populations with low expression of HLA-DR (indicative of monocyte dysfunction25) and enhanced expression of genes related to anti-inflammatory macrophage functions (such as SELL (CD62L) and CD163) but also by an abundance of immature neutrophils (including pro- and pre-neutrophils) expressing markers indicative of immunosuppression or dysfunction. These findings identify a dysfunctional monocyte response as well as dysregulated myelopoiesis as potentially important processes underlying the development of severe disease.

As discussed below, immune cell phenotyping via single-cell technologies has yielded potential insights into the immune pathways that may be dysregulated in severe COVID-19 (notably COVID-19-associated cytokine storm26) and in multisystem inflammatory syndrome in children (MIS-C) associated with SARS-CoV-2 infection. These insights could in turn inform ongoing and future development of anti-inflammatory treatment strategies for COVID-19 by targeting immunological factors with therapeutic potential. Evidence that such anti-inflammatory treatment strategies may be effective is provided by the encouraging preliminary results of the RECOVERY trial of dexamethasone in patients who were hospitalized with COVID-19 (ref. 27) (which caused ongoing corticosteroid trials28,29,30 to stop early based on data and safety monitoring board recommendations), along with the results of meta-analyses of randomized clinical trials of corticosteroids in patients with severe COVID-19 (refs. 31,32). Moreover, flow cytometry, mass cytometry and scRNA-seq have all demonstrated reduced HLA-DR and CD86 expression on monocytes in patients with severe COVID-19 (refs. 16,23,33) as well as in children with MIS-C associated with SARS-CoV-2 infection34, implying potentially impaired antigen presentation to T cells. Downregulation of HLA-DR on monocytes could potentially be driven by interleukin (IL)-6, which has been shown to be elevated in patients with severe COVID-19 (refs. 10,19,35,36) and in pediatric patients with MIS-C34,37, as decreased HLA-DR expression can be partially restored by tocilizumab33 (a humanized immunoglobulin (Ig)G1κ monoclonal antibody that targets IL-6 receptor (IL-6R) and blocks IL-6 signaling38).

Tocilizumab showed no efficacy in preventing disease progression39 or intubation or death in hospitalized patients with moderate COVID-19 (ref. 40); however, the latter study40 had wide confidence intervals in efficacy comparisons, and another report41 suggested potential benefit of tocilizumab in reducing mortality and the need for ventilation. Thus, the question of whether IL-6R blockade can benefit patients with moderate or severe COVID-19 disease remains open41, and chemokine receptor blockers are also being tested in ongoing clinical trials42,43. Additional applications of single-cell technologies in this context have identified the chemokine receptor CCR1 (refs. 11,44) as a potential therapeutic target based on scRNA-seq of nasopharyngeal samples from patients with critical or non-critical COVID-19 (ref. 44) or of nasopharyngeal and bronchial samples from patients with moderate or critical COVID-19 (ref. 11).

Macrophages

Multiple single-cell studies11,20,44,45 of airway and alveolar cells have also yielded insights into the role of macrophages in COVID-19, specifically that a dysregulated macrophage response may drive pathological inflammation46. Patients with critical COVID-19 have manifested increased ligand–receptor interactions between epithelial cells and immune cells, upregulation of pro-inflammatory chemokine and cytokine genes in non-resident macrophages and CCR1 upregulation in neutrophils, macrophages and CD8+ T cells11. These findings suggest the influence of cycles of recruitment of immune cells to the lung (monocytes that differentiate into inflammatory macrophages, further recruitment and activation of more immune cells) on the epithelial damage observed in severe COVID-19. scRNA-seq analysis of cells in BALF has revealed that the proportion of bronchoalveolar macrophages and the levels of inflammatory cytokine and chemokine receptors are positively associated with disease severity20. Macrophages in severe COVID-19 were also distinguished by high expression of FCN1 (encoding a member of the complement cascade) and SPP1 (encoding a pro-inflammatory cytokine), suggesting that alveolar macrophages may drive local inflammation in patients with severe COVID-19 (ref. 20). Finally, flow cytometry and scRNA-seq data of cells in BALF from patients with pneumonia caused by SARS-CoV-2 infection suggest that high levels of monocytes, as well as CD4+ and CD8+ T cells, are found in the alveolar space45. scRNA-seq identified multiple clusters corresponding to tissue-resident alveolar macrophages and monocyte-derived alveolar macrophages, along with expression of IFNG in T cells from patients with pneumonia caused by SARS-CoV-2 infection. 
The identification of an IFN response signature by bulk sequencing of flow cytometry-sorted alveolar macrophages and the finding of SARS-CoV-2 RNA in alveolar macrophages (which suggests that SARS-CoV-2 can replicate in alveolar macrophages) give strength to the hypothesis that activated T cells in severe COVID-19 release IFN-γ. In turn, this IFN-γ drives an IFN response in alveolar macrophages that leads to the recruitment of monocyte-derived alveolar macrophages, completing an ‘inflammatory signaling loop’ (ref. 45).

An alternative RNA-seq analysis of PBMCs from four healthy donors, five patients with influenza infection and eight patients with COVID-19 reported that classical monocytes in those with severe COVID-19 are distinguished by a type I IFN (IFN-I)-related transcriptional signature and an IL-1β-related inflammatory transcriptional signature47. This finding generated the hypothesis that the IFN-I response contributes to detrimental inflammation in severe COVID-19. In a study with a much larger sample size (including 130 patients with COVID-19 from three different centers in the UK), scRNA-seq and quantification of 188 cell surface proteins48 revealed a positive association of COVID-19 disease severity with the frequencies of proliferating monocytes and of MKI67- and TOP2A-expressing DCs. The same study also described how platelet expansion was associated with severe COVID-19, along with enhanced interactions of platelets with C1+CD16+ monocytes and CD16+ monocytes in COVID-19. These findings lend support to the role of both platelets and monocytes in the tissue thrombosis that has been reported in COVID-19 (ref. 49).

B cell responses

Neutralizing antibodies (nAbs) have been heavily implicated in protection against SARS-CoV-2 infection and COVID-19 disease50,51,52,53,54, and thus intense interest has focused on identifying potent nAbs against SARS-CoV-2. Such analysis requires a single-cell approach, owing to the extensive variable (V), diversity (D) and joining (J) gene (VDJ) recombination and somatic hypermutation in B cells55.

Immunoglobulin sequencing

A general strategy for identifying relevant antibody sequences has been to sort and process, via scRNA-seq, individual antigen-specific memory B cells (for example, specific to the receptor-binding domain (RBD) in the spike glycoprotein56 or the spike trimer15,57,58,59) from individuals convalescing from COVID-19. VDJ sequencing performed at the single-cell level (scVDJ-seq34) has led to identification of nAbs with prophylactic or therapeutic efficacy against SARS-CoV-2, feeding into the robust pipeline of nAb clinical trials60.

In a study that employed high-throughput scRNA–VDJ-seq, about 9,000 RBD-binding B cell clonotypes were identified, yielding 14 potent nAbs, one of which was shown to protect against SARS-CoV-2 in a mouse model56. The coupled scRNA-seq data allowed the identification of naive and memory B cell subsets and helped improve the efficiency of nAb selection by filtering out clonotypes enriched in naive and exhausted B cells56. A different study also using scRNA–VDJ-seq identified 19 potent nAbs, including RBD-binding and non-RBD-binding nAbs, one of which was shown to protect against SARS-CoV-2 in a hamster model61. The integration of two parallel workflows, both of which featured scRNA-seq and one of which incorporated single-cell functional assays, has been applied in combination with scVDJ-seq to identify five major classes of nAbs with different reactivities to the spike glycoprotein and cross-reactivity with SARS-CoV (ref. 54). A multi-omic integration of single-cell data, including scRNA-seq and CITE-seq, along with B cell receptor (BCR) and T cell receptor (TCR) sequencing, of PBMCs from patients with ‘stable’ COVID-19 (who were hospitalized and ultimately discharged) and patients with ‘progressive’ COVID-19 (who were treated in the intensive care unit (ICU) and ultimately succumbed to the disease) supported a complex B cell response in COVID-19, including a high proportion of unmutated Igγ heavy chain (IGHG) B cell clones present alongside multiple mutated B cell clones that did not appear to increase levels of somatic hypermutation over time62. The latter could potentially be explained by memory B cell cross-reactivity with other coronaviruses or failed formation of robust germinal center reactions59.

B cell markers

High-dimensional flow cytometry has been used to characterize and compare B cell responses in patients with different severities of COVID-19 disease. A 24-marker B-cell-focused panel designed to identify B cell populations, evaluate their activation status and assess homing potential63 allowed the identification of a correlation between overactivation of extrafollicular B cells and COVID-19 disease severity, along with greater expansion of antibody-secreting cells in severe versus milder disease. Patients with severe disease had high serum titers of RBD-targeting SARS-CoV-2 nAbs, raising the question of whether this distinct B cell response in patients with severe COVID-19 is ineffective or potentially even pathogenic. Other studies have described how the humoral immune response (as assessed by BCR clonal expansion and B cell activation) could be correlated with disease severity64; however, it remains an open question whether these positive correlations are simply due to increased initial viral load. This is a particularly complex question, with studies reporting direct65,66,67,68,69,70,71,72, inverse73,74 or no75 correlation between viral load and disease severity. These discrepant findings may be the result of differences in sampling compartment68 (saliva, blood or anal swab), timing of sample collection and population differences. In any case, additional studies are warranted to define the cellular and molecular determinants that dictate protective versus non-protective humoral responses in infected individuals. The potential of antibody-dependent enhancement of COVID-19 disease must also be considered76.

Convergent antibody clusters (antibodies with highly similar VDJs shared by multiple patients, which generally comprise only a small proportion of the virus-specific B cell response77) have also been identified78, with the suggestion that the majority of patients with COVID-19 have convergent Ig heavy chains against the RBD, which may bode well for spike-based or RBD-based vaccines.
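The definition of a convergent cluster given above (highly similar VDJs shared by multiple patients) can be illustrated with a minimal Python sketch. It is not the method used in the cited studies: the similarity rule (same V gene, same J gene, equal CDR3 length and at least 80% positional identity to the first sequence of a cluster), the function names and the toy records are all assumptions made for illustration.

```python
from collections import defaultdict


def hamming_identity(a, b):
    """Fraction of matching positions between two equal-length strings."""
    if not a:
        return 1.0
    return sum(x == y for x, y in zip(a, b)) / len(a)


def convergent_clusters(sequences, min_identity=0.8, min_donors=2):
    """Group heavy-chain records and keep clusters shared across donors.

    sequences: iterable of (donor, v_gene, j_gene, cdr3) tuples.
    The thresholds are illustrative assumptions, not published values.
    """
    # Only sequences with the same V gene, J gene and CDR3 length are compared.
    buckets = defaultdict(list)
    for seq in sequences:
        _, v_gene, j_gene, cdr3 = seq
        buckets[(v_gene, j_gene, len(cdr3))].append(seq)

    clusters = []
    for members in buckets.values():
        bucket_clusters = []
        for seq in members:
            # Greedy assignment: join the first cluster whose founding
            # sequence is similar enough, otherwise start a new cluster.
            for cluster in bucket_clusters:
                if hamming_identity(seq[3], cluster[0][3]) >= min_identity:
                    cluster.append(seq)
                    break
            else:
                bucket_clusters.append([seq])
        clusters.extend(bucket_clusters)

    # A cluster is "convergent" only if it spans at least min_donors donors.
    return [c for c in clusters if len({s[0] for s in c}) >= min_donors]


# Toy records invented for this example (not real patient data).
records = [
    ("P1", "IGHV3-53", "IGHJ6", "ARDLDYYGMDV"),
    ("P2", "IGHV3-53", "IGHJ6", "ARDLVYYGMDV"),
    ("P3", "IGHV1-69", "IGHJ4", "ARGGSYFDY"),
    ("P1", "IGHV3-53", "IGHJ6", "ARDAAAAAAAV"),
]
shared = convergent_clusters(records)
```

With these toy records, only the two similar sequences from donors P1 and P2 form a cluster that spans more than one donor. A greedy, founder-based assignment is used for brevity; a real analysis would more likely use a proper clustering method and a validated distance threshold.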

B cell phenotypes

In a study79 investigating longitudinal samples from individuals who had recovered from mildly symptomatic COVID-19, researchers used RBD tetramer enrichment and flow cytometry to phenotype the rare population of RBD-specific B cells. Numbers of RBD-specific memory B cells increased from 1 to 3 months after symptom onset and were substantially higher in COVID-19 samples than in healthy controls. Moreover, numbers of memory B cells displaying a TbetloIgG+CD21+CD27+ phenotype (that is, a ‘classical’ memory B cell phenotype) also increased from 1 to 3 months after symptom onset, whereas levels of Tbethi memory B cells, which are typically found in chronic infections, remained low. The question of whether SARS-CoV-2-specific memory B cells can produce nAbs after reactivation by secondary infection has also been addressed: sorting and sequencing of RBD-specific single B cells and their BCRs suggested that memory B cells may help protect from secondary infections.

Human tissue-imaging platforms coupled with multi-color immunofluorescence and multispectral imaging via quantitative automated scanning microscopy have been used to study, at the single-cell level, thoracic lymph nodes and spleens obtained from autopsies of patients who succumbed to COVID-19 (ref. 59). Compared with single-cell approaches requiring dissociation of tissue, an advantage of this approach is that tissue architecture can be largely preserved, enabling study of cell–cell interactions. A loss of germinal centers in the lymph nodes of these patients, accompanied by substantial reductions in numbers of germinal center B cells and follicular helper T (TFH) cells in the lymph nodes and spleens, was reported using this approach59. As optimal germinal center reactions are essential for the production of high-affinity antibodies80, these results suggest that the increased proportions of plasmablasts observed in patients with COVID-19 (particularly in those with severe COVID-19)10,15,17,59 are a correlate of suboptimal antiviral humoral immunity and disease rather than protection. Other emerging single-cell imaging-based techniques, such as single-cell spatial transcriptomics, may be particularly relevant to the emerging field of pathological studies in organs from individuals who succumbed to COVID-19 (refs. 81,82).

T cell responses

Multiple studies using flow cytometry of PBMCs from patients with COVID-19 have reported T lymphopenia: compared with the lower limit of normal or with levels observed in healthy controls, substantially reduced T cell14,17,19,83,84,85, CD4+ T cell17,57,58,83,84,85, CD8+ T cell10,17,57,58,83,84,85, CD8+ mucosal-associated invariant T (MAIT) cell17, γδ T cell10, αβ T cell (some subsets)10 and regulatory T cell84 counts have been observed in patients with mild or severe COVID-19. Similar T cell lymphopenia has been observed in patients with COVID-19-associated MIS-C34,37,86,87. While it has been reported13 that reductions in CD4+ and CD8+ T cell numbers were comparable between patients with COVID-19 and patients with influenza, some researchers have observed33 that the reduction in CD4+ T cell numbers is greater in patients with COVID-19 than that in patients with influenza. Decreased T cell populations have also been associated with disease severity, as numbers of both CD4+ (refs. 57,84) and CD8+ (ref. 57) T cells were shown to be substantially reduced in patients with severe versus moderate or mild COVID-19 or in ICU versus non-ICU cases83. An increased CD4+/CD8+ T cell ratio has also been reported in patients with COVID-19 (refs. 19,58), suggesting that SARS-CoV-2 infection might preferentially impact CD8+ T cells.

T cell phenotype

Multiparametric flow cytometry has also revealed phenotypic and functional alterations in T cells of patients with COVID-19, with a general theme emerging of hyperactivation of T cells in COVID-19, as demonstrated by elevated subpopulations expressing activation, proliferation or exhaustion markers. Levels of activated (HLA-DR+CD38+ (refs. 10,14,17,58), CD38+ (ref. 88), HLA-DR+ (ref. 89), CD25+CD4+ (ref. 10) and CD25+CD8+ (ref. 10)), proliferating (Ki67+CD8+ (refs. 58,88) or Ki67+CD4+ (ref. 88)) and exhausted (PD-1+CD8+ and/or PD-1+CD4+ (refs. 83,88), TIGIT+ (ref. 89) or NKG2A+ (ref. 90)) T cells have been shown to dramatically increase in patients with COVID-19 or in patients with severe COVID-19 compared with healthy controls, and most activated (CD38+PD-1+) CD8+ T cells in patients with acute COVID-19 have been shown to be specific for SARS-CoV-2 (ref. 88). However, perhaps due to differences in the timing of sampling, healthy controls, patients with influenza and patients with COVID-19 with similarly low abundances of HLA-DR+CD38+ activated CD8+ T cells have been described. Notably, scRNA-seq and single-cell TCR sequencing (scTCR-seq) analysis of CD8+ T cells in BALF samples identified signatures of tissue-resident memory T cells and increased levels of clonal expansion associated with mild disease, whereas elevated proliferative capacity was correlated with severe disease20. scRNA-seq analysis of cerebrospinal fluid leukocytes isolated from patients with COVID-19 and neurological sequelae (‘neuro-COVID’) has uncovered high levels of CD4+ T cells expressing exhaustion markers (for example, ICOS, HAVCR2 (TIM3) and CD226) compared with those in control patients.

Flow cytometry and scRNA-seq have also identified altered T cell differentiation and cytotoxicity in COVID-19. Patients with severe COVID-19 had higher proportions of circulating cytotoxic CD8+ T cells than healthy controls17, and CD8+ T cells in nasopharyngeal and bronchial samples from patients with severe COVID-19 have increased expression of cytotoxic molecules11. This increased cytotoxicity has been proposed to contribute to epithelial damage11. Regarding differentiation, a decrease in peripheral naive and central memory CD8+ T cells and an increase in senescent and effector memory CD45RA+CD8+ T cells in patients with COVID-19 (ref. 19) have been reported, suggesting a skew toward terminal differentiation. In another study59, numbers of CD4+BCL-6+ germinal center type TFH cells were substantially decreased in thoracic lymph node and spleen autopsy tissue from patients with COVID-19, accompanied by high levels of tumor necrosis factor (TNF)-α at the follicle and increased numbers of T helper type 1 (TH1) cells. These findings suggest that COVID-19 impairs TFH cell differentiation, which may explain the lack of germinal centers mentioned above.

T cell specificity

SARS-CoV-2-specific T cells recognizing both spike- and non-spike epitopes have been identified in patients with acute COVID-19 and in convalescent patients through various flow cytometry-based techniques such as intracellular cytokine staining, activation-induced marker assays (including antigen-reactive T cell enrichment) and peptide–major histocompatibility complex (MHC) multimers (Supplementary Table 1), demonstrating the formation of virus-specific memory T cells after infection79,88,91,92,93,94,95,96,97,98. Although patients with severe COVID-19 showed overall higher breadth and magnitude of SARS-CoV-2-specific T cell responses than patients with mild COVID-19, higher proportions of SARS-CoV-2-specific CD8+ T cells were observed in mild versus severe cases97. SARS-CoV-2-specific CD4+ T cells are predominantly TH1 cells and largely display a central memory T phenotype, whereas SARS-CoV-2-specific CD8+ T cells are more enriched in effector memory and terminally differentiated effector subsets95,98,99. scRNA-seq and scTCR-seq analysis of SARS-CoV-2-reactive CD4+ T cells revealed an association between increased SARS-CoV-2-specific cytotoxic CD4+ T cells and cytotoxic TFH cells with disease severity and an inverse association of regulatory T cells with disease severity100. SARS-CoV-2-reactive CD8+ T cells were shown to have enhanced expression of cytotoxic and inflammatory genes with higher levels of TCR clonal expansion in patients with severe disease, supporting an association of overactivated antigen-specific T cells with COVID-19 pathogenesis101. As in vitro stimulation can radically alter the gene expression profiles of reactive T cells, further studies using peptide–MHC multimers together with single-cell sequencing should provide complementary and additional insights into the phenotype and function of SARS-CoV-2-specific T cells. 
In addition, single-cell multi-omic technologies, such as the single-cell assay for transposase-accessible chromatin using sequencing (scATAC-seq), could be applied to investigate whether SARS-CoV-2-specific T cells retain epigenetic fingerprints that may dictate their recall responses.

Contribution of T cells to COVID-19 immunity

Many questions remain about how prior exposure to endemic coronaviruses and cross-reactive memory T cell immunity shape the immune response to SARS-CoV-2 (refs. 102,103). Intracellular cytokine staining and flow cytometry, enzyme-linked immunosorbent spot (ELISpot) and FluoroSpot assays have been key single-cell technologies for detecting cross-reactive memory CD4+ T cells in SARS-CoV-2-unexposed individuals (ranging between approximately 20% and 50% of individuals tested across geographically diverse cohorts), whereas cross-reactive memory CD8+ T cells are much less common88,91,92,93,94,95,104. It has been postulated that cross-reactive memory CD4+ T cells may provide protective immunity against SARS-CoV-2 infection and reduce disease severity by promoting B cell and antibody responses or by mediating rapid local antiviral immunity at sites of infection, including the lung and upper respiratory tract103. However, it was reported105 that pre-existing cross-reactive memory CD4+ T cells may not only have low TCR avidity with reduced clonal expansion but also exacerbate inflammation and disease severity, especially in the elderly106. Thus, the roles of pre-existing cross-reactive memory T cells in SARS-CoV-2 infection and vaccination in the general population warrant further investigation.

Correlates of immune protection at the single-cell level

With the rapid growth of single-cell datasets on the immunology of COVID-19, one open question is how to best leverage current and emerging public datasets. Ongoing efficacy trials of candidate COVID-19 vaccines are generating single-cell immune-profiling datasets that could be applied to identifying correlates of risk and correlates of protection against primary and secondary endpoints in these trials, for instance, as we have done previously for the RV144 HIV vaccine efficacy trial105. Because of limited specimen volumes in COVID-19 vaccine efficacy trials and the large number of immunological biomarkers that could potentially be evaluated as correlates, it will be critical to perform pilot studies to optimize and identify assays and associated immune signatures with favorable statistical properties for correlates analyses. Important factors in determining the best correlates include high reproducibility, large dynamic range in vaccine recipients and low response range (low false positive rate) at baseline in vaccine recipients and in placebo recipients after placebo treatment. To help expedite correlates analyses, existing single-cell datasets could be explored to optimize and identify highly reproducible single-cell immunological signatures, which will be useful in variable down-selection.
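The three properties listed above (reproducibility, dynamic range in vaccine recipients and a low false positive rate at baseline and in placebo recipients) can each be reduced to a simple summary statistic when screening candidate markers in a pilot study. The Python sketch below is one possible way to do this, under stated assumptions: the function name, the choice of statistics (mean relative difference between replicates, max minus min, fraction of control samples above a cutoff) and the example numbers are all hypothetical and are not taken from any trial.

```python
from statistics import mean


def screen_marker(replicate_pairs, vaccine_values, control_values, cutoff):
    """Summarize three screening properties for one candidate marker.

    replicate_pairs: (run1, run2) measurements of the same sample
                     (assumed positive).
    vaccine_values:  post-vaccination values in vaccine recipients.
    control_values:  baseline and placebo values, where no response
                     is expected.
    cutoff:          positivity threshold (an assumed input).
    """
    # Reproducibility: mean absolute replicate difference relative to
    # the replicate mean (lower is better).
    rel_diffs = [abs(a - b) / ((a + b) / 2) for a, b in replicate_pairs]

    return {
        "replicate_rel_diff": mean(rel_diffs),
        # Dynamic range: spread of responses among vaccine recipients.
        "dynamic_range": max(vaccine_values) - min(vaccine_values),
        # False positive rate: share of control samples above the cutoff.
        "false_positive_rate": sum(v > cutoff for v in control_values)
        / len(control_values),
    }


# Made-up numbers, used only to show the call signature.
summary = screen_marker(
    replicate_pairs=[(10, 10), (20, 22)],
    vaccine_values=[5, 50, 500],
    control_values=[1, 2, 12, 3],
    cutoff=10,
)
```

Markers could then be ranked or filtered on these summaries before variable down-selection; the statistics that are actually appropriate would depend on the assay and the study design.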

Another major application of these collated datasets is the potential for integrative analyses and meta-analyses, such as the recent meta-analysis of 107 lung scRNA-seq studies that identified additional proteases that may potentially be involved in SARS-CoV-2 infection107.

Box 1 (together with Figs. 2 and 3) illustrates one way to organize publicly available single-cell data for use in integrative analysis. Readers should note, however, that a search for biomarkers and signatures associated with COVID-19 disease severity or progression using such published data faces several challenges.

Fig. 2: Visual representation of single-cell transcriptomic data.

a, Plots showing the distribution of age (left) and sex (right) among individuals included in the collected 21 datasets. b, Bar plot showing the number of samples per tissue type among the collected 21 datasets. c, UMAP plots showing the projection of over 2.5 million single cells from 16 PBMC and whole-blood datasets (the latter without neutrophils or basophils) mapped to the Seurat CITE-seq reference and colored and labeled by reference-defined cell type annotations at level 1 (l1, left), level 2 (l2, middle) and level 3 (l3, right) granularity. CTL, cytotoxic T lymphocyte; eryth, erythrocyte; ILC, innate lymphoid cell; mono, monocyte; Treg, regulatory T cell; ASDC, Axl+ dendritic cell; cDC, conventional dendritic cell; dnT, double-negative T cell; gdT, γδ T cell; HSPC, hematopoietic stem and progenitor cell; mDC, myeloid dendritic cell; TCM, central memory T cell; TEM, effector memory T cell. d, Box plots showing the frequencies of CD4+ T cells, CD8+ T cells, MAIT cells, CD56bright NK cells, CD16+ monocytes and CD14+ monocytes among samples grouped by disease severity and time points for COVID-19 samples. Samples from Arunachalam et al.23 (enriched for DCs), Meckiff et al.100 (enriched for antigen-specific CD4+ T cells), Kusnadi et al.101 (enriched for antigen-specific CD8+ T cells) and Bacher et al.106 (enriched for antigen-specific CD4+ T cells) were excluded. Early, ≤8 d after symptom onset; intermediate, >8 and ≤15 d after symptom onset; late, >15 d after symptom onset. Statistical significance between mild and severe disease was determined by Wilcoxon test. *P ≤ 0.05, **P ≤ 0.01, ***P ≤ 0.001, ****P ≤ 0.0001.

Fig. 3: An example screenshot of the visualization portal.

The website provides visualization of 21 individual datasets and the merged dataset consisting of 16 datasets. The UMAP plot shows the merged dataset consisting of over 2.5 million cells mapped to the human PBMC CITE-seq reference126. Processed datasets can also be downloaded from the website at https://atlas.fredhutch.org/fredhutch/covid. QC, quality control.

One challenge is whether and how to account for trial-participant demographics, as ethnic108,109,110,111, sex110,112 and age113,114,115,116,117 differences in clinical presentation, immune responses and outcomes of SARS-CoV-2 infection have been identified, even when adjusting for potential confounders. A second major challenge is the substantial heterogeneity thus far in the ordinal scales that have been used for assessing coronavirus disease severity118: of the 44 studies that categorized disease severity, 7 used World Health Organization scores, 5 used National Health Commission of China guidelines, 3 used National Institutes of Health (NIH) scores, 1 used the German Robert Koch Institute symptom classification, 1 used the National Early Warning Score, 4 did not provide relevant information on how disease severity was defined, and 23 used custom scoring (Supplementary Table 1). Moreover, for studies that obtained samples at ‘early’ and ‘late’ disease stages, there is substantial heterogeneity in how the time after symptom onset is defined. The incorporation of standardized definitions of disease severity into future single-cell immune-profiling studies would facilitate integrative analyses and meta-analyses. As one approach to overcome this problem, seven standardized disease severity categories were defined119 and manually assigned in an integrated analysis of 4,780 PBMC transcriptomic samples from patients infected with different viruses (16 viruses in total across 26 datasets). Integrative analysis of single-cell sequencing data has supported the hypothesis that there is a conserved pan-viral response associated with disease severity: an integrative analysis120 of three single-cell datasets (two CITE-seq datasets23,120 and one scRNA-seq dataset15; 264,224 cells from 71 PBMC samples overall) from three independent cohorts including healthy controls15,23,120, patients with SARS-CoV-2 infection15,23,120 and patients with influenza or respiratory syncytial virus infection23 reported a ‘meta-virus signature’ score in single myeloid cells that was positively correlated with viral infection severity across different viruses.
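One practical way to reduce the heterogeneity in ‘early’ and ‘late’ labels is to re-derive the disease stage of every sample from the reported number of days after symptom onset, using a single explicit rule. The sketch below applies the cut-offs used in Fig. 2 (early, ≤8 d; intermediate, >8 and ≤15 d; late, >15 d); the study labels and sample records are hypothetical, and the approach assumes that days after symptom onset are available for each sample, which is not the case for every published dataset.

```python
from collections import Counter


def stage_from_days(days_after_onset):
    """Map days after symptom onset to the stage definitions used in Fig. 2."""
    if days_after_onset < 0:
        raise ValueError("sample collected before symptom onset")
    if days_after_onset <= 8:
        return "early"
    if days_after_onset <= 15:
        return "intermediate"
    return "late"


# Hypothetical records: (study label, stage as reported by the authors,
# days after symptom onset). Note that the two studies use 'early' and
# 'late' differently.
samples = [
    ("study_1", "early", 5),
    ("study_1", "late", 12),
    ("study_2", "early", 10),
    ("study_2", "late", 21),
]

# Replace each author-reported label with the standardized one.
harmonized = [
    (study, reported, stage_from_days(days)) for study, reported, days in samples
]
stage_counts = Counter(stage for _, _, stage in harmonized)

for study, reported, standardized in harmonized:
    print(f"{study}: reported {reported!r} -> standardized {standardized!r}")
```

Here a sample labeled ‘late’ in one hypothetical study and a sample labeled ‘early’ in another both fall into the intermediate category once the same rule is applied, which is the kind of discrepancy that complicates pooled analyses.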

The lack of standardized experimental protocols and analysis pipelines is an additional challenge in integrating single-cell datasets. As a result, published results and datasets can be difficult to compare and integrate because of experimental and computational variation in how the cells and datasets were processed. Fortunately, the research community has worked hard to provide computational approaches that can be used to reduce technical variation and produce standardized cell annotations that can be compared, visualized and modeled across datasets121,122. These approaches have already been used in many of the COVID-19 single-cell studies published to date to correct for batch effects120. Members of the Human Cell Atlas123 initiative have also been working to standardize the different aspects of single-cell sequencing, and the Human Cell Atlas Data Portal provides publicly available datasets processed by standardized pipelines124.
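To make the notion of a batch effect concrete, the toy example below removes a constant per-dataset offset from the expression values of a single gene by centering each batch on its own mean. This is a deliberately simplified stand-in and is not the method used by the approaches cited above, which are considerably more sophisticated; it also assumes that the batches have comparable cell-type composition, which often does not hold for COVID-19 datasets. The values and batch labels are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def center_by_batch(values, batches):
    """Subtract each batch's mean so that every batch is centered on zero."""
    values_by_batch = defaultdict(list)
    for value, batch in zip(values, batches):
        values_by_batch[batch].append(value)
    batch_means = {batch: mean(vals) for batch, vals in values_by_batch.items()}
    return [value - batch_means[batch] for value, batch in zip(values, batches)]


# Hypothetical log-expression values for one gene in six cells from two
# datasets; dataset B carries a constant technical offset of +10.
expression = [1.0, 2.0, 3.0, 11.0, 12.0, 13.0]
batch_labels = ["A", "A", "A", "B", "B", "B"]

corrected = center_by_batch(expression, batch_labels)
print(corrected)  # the offset between the two datasets is removed
```

After centering, cells with the same relative expression within their dataset receive the same value, so the remaining variation reflects within-dataset differences rather than the offset between datasets.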

Conclusions

Single-cell analyses have lived up to their promise of overcoming certain limitations of bulk methods and enabling a deep dive into the cellular heterogeneity of antiviral immune responses. Multiple single-cell immune-profiling studies of patients with COVID-19 have identified distinct cell subsets of the innate and adaptive immune systems that correlate with disease severity; this body of evidence supports the hypothesis that such subsets may have important functions in blunting (or even enhancing) COVID-19 disease severity. There is also evidence to suggest that targeting certain immunological factors such as cytokine or chemokine receptors (for example, IL-6R and CCR1) might curb pathogenic responses and improve protective immunity. In the near term, studies applying single-cell multi-omic technologies such as CITE-seq and single-cell B cell receptor sequencing (scBCR-seq) or scTCR-seq with peptide–MHC multimers are needed to further characterize the phenotypes and functions of immune cell subsets implicated in COVID-19 protection and of those implicated in progression. Application of single-cell epigenomic technologies such as scATAC-seq to understand the epigenetic changes associated with SARS-CoV-2 infection in immune cells, especially in antigen-specific T and B cells, may help identify new avenues to pursue for COVID-19 therapies. Moreover, additional multi-omic spatial immune-profiling studies are needed to further dissect local immune responses against SARS-CoV-2 in infected tissues such as the lung. In the longer term, single-cell immune-profiling studies with sufficiently large sample sizes and participant diversity will be valuable to help investigate potential sex-related and age-related differences in COVID-19, including whether the immune cell subsets of interest described above vary in frequency or in function in a sex-dependent manner. Application of single-cell immune profiling to better understand the mechanisms driving the range of post-COVID-19 conditions (most prominently, long COVID) is also a relatively underexplored area with many unanswered questions.

The scientific community has mobilized in unprecedented fashion in response to the ongoing COVID-19 pandemic. Large collaborative efforts such as the US NIH-funded Immunophenotyping Assessment in a COVID-19 Cohort study (NCT04378777) and the COVID-19 Cell Atlas125 (Wellcome Sanger Institute–Chan Zuckerberg Initiative), to name a few, are generating freely available, open-access datasets at the single-cell level, and work is also being done to standardize protocols and metadata. These datasets have been derived from patients with COVID-19 of varying severity and include different tissues, cohort features and time points. Looking ahead, we anticipate that the number and complexity of single-cell datasets will grow rapidly, including in important populations such as pediatric patients (for whom little single-cell data currently exists), and that reanalyses and meta-analyses will become more common as standards become available.