Immune cell profiling of COVID-19 patients in the recovery stage by single-cell sequencing

COVID-19, caused by SARS-CoV-2, has recently affected over 1,200,000 people and killed more than 60,000. The key immune cell subsets change and their states during the course of COVID-19 remain unclear. We sought to comprehensively characterize the transcriptional changes in peripheral blood mononuclear cells during the recovery stage of COVID-19 by single-cell RNA sequencing technique. It was found that T cells decreased remarkably, whereas monocytes increased in patients in the early recovery stage (ERS) of COVID-19. There was an increased ratio of classical CD14++ monocytes with high inflammatory gene expression as well as a greater abundance of CD14++IL1β+ monocytes in the ERS. CD4+ T cells and CD8+ T cells decreased significantly and expressed high levels of inflammatory genes in the ERS. Among the B cells, the plasma cells increased remarkably, whereas the naïve B cells decreased. Several novel B cell-receptor (BCR) changes were identified, such as IGHV3-23 and IGHV3-7, and isotypes (IGHV3-15, IGHV3-30, and IGKV3-11) previously used for virus vaccine development were confirmed. The strongest pairing frequencies, IGHV3-23-IGHJ4, indicated a monoclonal state associated with SARS-CoV-2 specificity, which had not been reported yet. Furthermore, integrated analysis predicted that IL-1β and M-CSF may be novel candidate target genes for inflammatory storm and that TNFSF13, IL-18, IL-2, and IL-4 may be beneficial for the recovery of COVID-19 patients. Our study provides the first evidence of an inflammatory immune signature in the ERS, suggesting COVID-19 patients are still vulnerable after hospital discharge. Identification of novel BCR signaling may lead to the development of vaccines and antibodies for the treatment of COVID-19.


Introduction
COVID-19, caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), has spread in many countries [1][2][3] . As of April 6, 2020, SARS-CoV-2 has affected over 1,200,000 people and killed more than 60,000 of those affected in more than 160 countries. Following its global spread, the World Health Organization declared it a public health emergency of international concern 4 . COVID-19 shows symptoms of fever, dry cough, fatigue, diarrhea, conjunctivitis, and pneumonia. Some patients develop severe pneumonia, acute respiratory distress syndrome (ARDS), or multiple organ failure [5][6][7] . Although scientists and clinicians worldwide have made great efforts to produce vaccines and explored antiviral drugs 8,9 , there is still no specific medicine and highly effective clinical treatment for COVID-19 10,11 .
Immune system dysregulation, such as lymphopenia and inflammatory cytokine storm, have been observed and are believed to be associated with the severity of pathogenic coronavirus infections, such as severe acute respiratory syndrome coronavirus (SARS-CoV) and Middle East respiratory syndrome coronavirus (MERS-CoV) infections 12,13 . With regard to COVID-19, recent studies also found decreases in lymphocyte numbers and increases in serum inflammatory cytokine levels in peripheral blood 5,14 . However, the manner in which key immune cell subsets change and their states during COVID-19 have remained largely unclear. Thus, defining these key cellular subsets and their states in COVID-19 is a crucial step in obtaining critical insights into the immune clearance mechanism and developing new therapeutic strategies for COVID-19.
Here, we applied single-cell RNA sequencing (scRNAseq) to comprehensively characterize the changes in peripheral blood mononuclear cells (PBMCs) from 10 COVID-19 patients. Our study depicted a high-resolution transcriptome landscape of blood immune cell subsets during the recovery stage of COVID-19. It revealed that, compared to that in the healthy controls (HCs), monocytes containing high inflammatory gene expression and IL1β + subsets predominated, whereas CD4 + T cells decreased remarkably in patients in the early recovery stage of COVID-19. We found that T and B cell clones were highly expanded during the recovery stage in COVID-19 patients. Furthermore, several specific BCR changes in COVID-19 patients during the recovery stage may be helpful for vaccine and antibody production.

Study design and analysis of single immune cell profiling in COVID-19 patients
To map the immune microenvironment of COVID-19 patients, we identified mirroring changes in the blood and pinpointed cell-specific alterations associated with disease severity and recovery; we then integrated scRNA-seq, single-cell paired BCR, and single-cell paired TCR analysis from a total of 10 COVID-19 patients in the early recovery stage (ERS) or late recovery stage (LRS) (70,858 PBMCs). We also collected scRNA-seq data (57,238 cells) from five healthy donors as controls ( Fig. 1a and Supplementary Fig. S1a-c). This dataset passed stringent high-quality filtering. Single-cell suspensions of the scRNA-seq samples were converted to barcoded scRNAseq libraries using 10x Genomics. Cell Ranger software (version 3.1.0) was used for the initial processing of the sequencing data.
Using t-distributed stochastic neighbor embedding (t-SNE), we analyzed the distribution of the three immune cell lineages, myeloid, NK and T, and B cells, based on the expression of canonical lineage markers and other genes specifically upregulated in each cluster (Fig. 1b, c). For marker genes, expression values in each cell positioned in a t-SNE are shown in Fig. 1d. We next clustered the cells of each lineage separately and identified a total of 20 immune cell clusters.

An overview of NK and T, B, and myeloid cells in the blood of convalescent patients with COVID-19
The immune cell compartment of patients who have recovered from COVID-19 infection comprised all major immune lineages. We analyzed 128,096 scRNA-seq profiles that passed quality control, including 36,442 myeloid cells, 64,247 NK and T cells, and 10,177 B cells from five HCs, five ERS, and five LRS patients. The sketchy clustering analysis landscape of each subject is presented in Supplementary Fig. S2a, and the merged image of each group is shown in Fig. 2a. We discovered that COVID-19 patients, including ERS and LRS, demonstrated a higher proportion of myeloid cells compared to the HCs, but with a lower proportion of NK and T cells (Fig. 2b, c). Interestingly, LRS patients had more B cells and NK and T cells, but less myeloid cells, than the ERS patients (Fig. 2b, c). Thus, these findings indicated that COVID-19 patients had decreased lymphocyte counts and increased counts of myeloid cells in peripheral blood.
To further understand the changes in the myeloid, NK and T, and B cells in COVID-19 patients, we conducted differential expression gene (DEG) analysis of the NK and T, B, and myeloid cells between the HCs and patients. The heatmaps are shown in Fig. 2d-f. Inflammatory cytokines and chemokines such as IL1B, CCL3, IRF1, DUSP1, JUN, and FOS were all expressed at high levels in patients, regardless of myeloid cells (Fig. 2d), NK and T cells (Fig.  2e), or B cells (Fig. 2f).
Collectively, our results demonstrated that myeloid cells increased, whereas NK and T cells decreased in the peripheral blood of COVID-19 patients and that the immune cell compositions differed between the patients in the ERS and LRS.
Notably, IL1β expression values in a UMAP with simultaneous contrast indicated that IL1β was upregulated in the ERS group and decreased in the LRS patients (Fig. 3f). This was also confirmed in the DC cluster of the ERS group compared to that of the HCs (Supplementary Fig.  S3a, b). Next, we took the average of the inflammatory genes for each myeloid cell scRNA-seq subset in the COVID-19 patients versus that in the HCs (Supplementary Fig. S3c). These results demonstrated that cytokine activation drives the expansion of monocyte populations (especially CD14 ++ inflammatory monocytes) in COVID-19-infected patients. To explore the biological significance of the transcriptional changes in the M1 cluster, we performed GO analysis with DEGs (Fig. 3g). We observed enrichment of the pathways related to cytokine signaling and inflammation activation, which were driven by the upregulation of IFITM3 and IFI6 and IL1β, JUN, FOS, JUNB, and KLF6 (Fig. 3g).
Collectively, these findings demonstrate that a dysregulated balance in the monocyte populations in ERS patients is manifested by substantially increased classical CD14 ++ monocytes. Our results suggest that the classical CD14 ++ monocytes increase in circulation to fuel inflammation during SARS-CoV-2-infection.
The composition of the T and NK cell subsets differed significantly among the HCs and COVID-19 patients (Fig.  4c). The absolute number of CD8 + T cells especially the effector memory CD8 + T cell subgroup and NK cells decreased in COVID-19 patients, whereas the relative ratio of NK cells in ERS was higher than that in the HCs. The ratio of CD4 + T cells was stable, but the composition of the CD4 + T cell subset differed significantly between the HCs and COVID-19 patients. Among CD4 + T cells, the ratio of central memory CD4 + T cells was significantly higher, whereas the ratio of naïve CD4 + T cells, Tregs and effector memory CD4 + T cell was lower than that in the HCs especially in ERS group. Notably, genes associated with CD4 + T cells had relatively high expression levels of inflammation-related genes and were significantly upregulated in the COVID-19 patients (Fig. 4d). CD4 + T cells had high expression levels of inflammatory genes, including FOS, JUN, KLF6, and S100A8 in patients in the ERS of COVID-19 (Fig. 4e). In contrast, antiinflammatory genes associated with CD4 + T cells were downregulated in COVID-19 patients relative to that in the HCs (Fig. 4d, e). This suggested that CD4 + T cells were the main participants in the virus infection. Comparison of the DEGs in the CD4 + T cells revealed the enrichment of genes participating in the cytokine pathway and inflammation activation, including IFITM3 and IFI6 and IL1B, JUN, FOS, JUNB, and KLF6 (Fig. 4f). Further studies are needed to elucidate the IFN pathways involved in COVID-19 pathogenesis. TCR-seq analysis showed that T cell expansion was obviously decreased in the ERS group than in the HC group (Fig. 4g). Moreover, naïve or central memory T cells showed little clonal expansion, while effector memory T cells, terminal effector CD8 + T cells (CTLs), and proliferating T cells showed higher expansion levels ( Fig. 4h). In addition, the most highly expanded (maximum) clone in the ERS group was TRAV8-6-TRAJ45: TRAV7-8-TRBJ2-1 ( Supplementary Fig. S5d). The decreased ratio of CD8 + T cells in COVID-19 patients may implicate the role of CD8 + T cells in virus clearance (Fig. 4c). Moreover, the CD8 + CTL with expanded clones also exhibited overactivated inflammation and antiviral activity compared to those in HCs ( Fig. 4i and Supplementary Fig. S4b). Together, these findings show that clonally expanded CD8 + T cells in the peripheral blood of COVID-19 patients help control the virus. We also performed DEG analysis via Seurat FindAllMarkers analysis and found similar results in T prol cells ( Supplementary  Fig. S4c). Next, we took the average of inflammatory genes for each NK and T cell subset scRNA-seq subset in the COVID-19 patients versus normal RNA-seq data (Supplementary Fig. S4d).
In comparison with that in the HCs, the percentage of plasma cells increased significantly in COVID-19 patients, whereas naïve B cells decreased significantly in the COVID-19 patients (Fig. 5c). Memory B cells and plasma cells (MPB) might play an important role in the control of viral infection and the development of adoptive immunity as they synergistically work and induce specific antibodies. Moreover, compared to that in the HCs, B cell activation-related genes, including S100A8, IGLL5, SSR3, IGHA1, XBP1, and MZB1 were primarily expressed in the MPB of the ERS group (Fig. 5d). We also found similar results in the plasma cells, the antibody-secreting cells (ASC) ( Supplementary Fig. S5b, c), suggesting a key role for ASC in viral control. Next, we took the average of the inflammatory genes for each B cell subset of the COVID-19 patients versus the normal RNA-seq data (Fig. 5e). The difference in the genes between the ERS and HCs indicated enhanced B cell reaction and antibody secretion in COVID-19 patients. GO analysis revealed that IGHA1, XBP1, MZB1, JUN, POLR2L, and ZFP36 were overpresented in MPBs, which suggests enhanced B cell proliferation and viral transcription in COVID-19 patients (Fig. 5f). Single-cell BCR-seq analysis indicated that the IgA isotype was over-represented in COVID-19 patients compared to that in the HC (Fig. 5g). This corresponded with an increase in the levels of serum IgA, which was also pronounced in other coronavirus infections. Moreover, the ratio of (IgA+IgG+IgE) to (IgD+IgM) increased significantly in the ERS patients and showed a downward trend with recovery time (Fig. 5h).

Expanded BCR clones and biased usage of VDJ genes observed in COVID-19 patients
Using sc-BCR-seq to assess the status of clonal expansions in the blood of patients, we found that IL4R + naïve B cells showed little clonal expansion, whereas CD27 + CD38 + memory B cells showed the highest expansion levels among diverse B cell subsets (Fig. 6a). At the individual level, we found that COVID-19 patients had significantly expanded clones compared to that in the HCs, supporting the assumption that B cells had experienced unique clonal VDJ rearrangements under SARS-CoV-2-infection. We also found that a higher B cell clonality consistently remained in the ERS compared with that in the LRS patients (Fig. 6b). Moreover, quantification of the most highly expanded (maximum) clone for each subject showed that the ratios of the maximum clones were higher in the ERS group than in the HCs (Fig. 6c). To understand the functional status of expanded cloned B cells, we performed DEG analysis between the cloned memory B cells and the other B cells. Our results revealed increased expression of B cell genes, including  CD27, SSR4, IGHG1, MZB1, and XBP1, which further supports the superior effector functions of the expanded cloned B cells (Fig. 6d). Moreover, the differential genes for expanded B cells significantly subsided over time and reduced in LRS patients (Fig. 6d).
In summary, an increase in clonality in COVID-19, which was dominated by the IgA and IgM isotypes, together with a skewed use of the IGHV gene, suggested the contribution of SARS-CoV-2 to pathogenesis. Notably, the biased usage of dominated IGV genes, especially the IGHV3-23 and IGHV3-7 in COVID-19 patients, provides a framework for the rational design of SARS-CoV-2 vaccines.

Cell-to-cell communication among immune cells in COVID-19 patients
An established computational approach 17 was used to predict cell-to-cell interactions that may contribute to the distinct functional state of T cells, B cells, monocytes, and dendritic cells (DCs) in ERS and LRS (Fig. 7a, b). In ERS COVID-19 patients, we found adaptive signals involved in monocyte activation, proliferation, and inflammatory signaling (Fig. 7a, b). T cells expressed genes encoding ligands of TNFSF8, LTA, IFNG, IL17A, CCR5, and LTB to TNFRSF8, TNFRSF1A/TNFRSF14, IFNGR1, IL-17RA, CCR1, and LTBR, which were expressed on monocytes and could contribute to the pro-inflammatory status. Other T cell-monocyte interactions involved the expression of CSF2 and CSF1. T cells might activate monocytes through the expression of CSF2 and CSF1, which bind to CSFRs (CSFR2/1) and contribute to inflammatory storm. A cluster of CD14 + monocytes exclusively expressed IL1β, which was predicted to bind to IL1RAP expressed by T cells. T cell-monocyte interaction may enhance immune response and be exclusive to COVID-19 patients (Fig. 7a, b). Furthermore, we found that monocytes highly expressed the poliovirus receptor, which serves as a cellular receptor for poliovirus in the first step of poliovirus replication and induction of the NF-kappa B signaling pathway. From the B cell-monocyte and B cell-T cell interactions, we found that B cells could secrete a large number of IL-6, LTA, and LTB, which are combined with IL-6R, LTAR, and LTBR expressed in monocytes, and a large amount of IL-6 was applied to T cells to promote the secretion of IFN-γ, IL-1β, and other inflammatory cytokines and chemokines. Thus, a cascade signature of inflammatory monocytes with high expression of IL-6 and their progeny were formed in the peak incidence of ERS COVID-19 patients (Fig. 7c). These activated immune cells may enter the circulation in the lung and other organs in large numbers and play an immune-damaging role. In LRS COVID-19 patients, DC ligands were predicted to interact with B and T cell receptors involved in cell proliferation and the production of antibodies. We discovered that the peripheral blood of LRS patients contains a diversity of antibodies; we found that IL18-IL18RAP, TNFSF13-TNFRSF13B, TNFSF13-TNFRSF17, TNFSF13B-TNFRSF17, TNFSF13B-TNFRSF13B, and TNFSF13B-TNFRSF13C were highly expressed in our analysis of DC-B cell interaction (Fig. 7d). Thus, we speculate that DCs produce IL-18, TNFSF13, and TNFSF13B to promote the proliferation of B cells and then secrete many antibodies into the blood in ERS. From the DC-T and T cell-B cell interactions, we discovered (see figure on previous page) Fig. 3 Myeloid cell subsets and their states in the blood of convalescent patients with COVID-19. a Six clusters of myeloid cells were displayed according to marker gene expression levels. Uniform manifold approximation and projection (UMAP) presentation of the heterogeneous clusters of peripheral myeloid cells. Classical CD14 ++ monocytes (M1), non-classical CD16 ++ (FCGR3A) CD14 −/+ monocytes (M2), intermediate CD14 ++ CD16 + monocytes (M3), CD1C + cDC2 (M4), CLEC9A + cDC1 (M5), and pDC (CLEC4C + CD123 + ) (M6). b The UAMP plot shows subtype-specific marker genes of myeloid cells, including CD14, FCGR3A, CD1C, CLEC9A, CLEC4C, and IL-1β. c Bar chart of the relative frequencies of the six sub-clusters of myeloid cells and three sub-clusters of monocytes across the HCs and the ERS and LRS patients. d The heatmap shows the top DEGs between COVID-19 patients and HCs in CD14 ++ monocytes. e Volcano plot of fold change between COVID-19 patients and HCs in CD14 ++ monocytes. P values were calculated using a paired, two-sided Wilcoxon test and FDR corrected using the Benjamini-Hochberg procedure. f The UAMP plot shows that IL-1β was highly expressed in the ERS patients vs. the LRS patients and HCs in myeloid cells. g GO BP enrichment analysis of the DEGs of CD14 ++ monocytes upregulated in COVID-19 patients. P value was derived by a hypergeometric test. UMAP_1   UMAP_2   NK1   NK2   T1   T2   T3   T4   T5   T6   T7  T8   CD4 CD8A that DCs produce not only IL-18 but also IL-7 to promote the proliferation of T cells; moreover, T cells produce IL-2 to promote the proliferation and antibodies production of B cells (Fig. 7d). Thus, cell-to-cell interactions help us to understand why COVID-19 patients manifested high rates of monocytes and low rates of lymphocytes and why the proportion of lymphocytes gradually increased in the peripheral blood of recovering patients.

Discussion
The clinical presentation of COVID-19 varies from asymptomatic to severe ARDS. This has been similarly observed in severe acute respiratory syndrome coronavirus (SARS-CoV), Middle East respiratory syndrome coronavirus (MERS-CoV), and influenza infections 12,14 . In viral infection, it is generally accepted that host immune responses determine both protection against viral infections and the pathogenesis of respiratory injury 18,19 . A coordinated response in innate and adaptive immune cells working in concert may lead to the rapid control of the virus, whereas a failed immune response might lead to viral spreading, cytokine storm, and a high mortality rate 20 . Despite belonging to same group of viruses, recent studies have highlighted differences between COVID-19, SARS, and MERS, such as the speed of transmission, treatment scheme, and mortality rate. Moreover, this difference may also exist in the key immune players and the underlying molecular mechanisms related to these diseases. The lack of knowledge regarding the immune impact of COVID-19 has now become a critical issue in view of its rapid spread and the shortage of specific therapy 21 . Using single-cell sequencing, we profiled the complexity of immune populations in the blood and analyzed 70,858 cells from 10 patients. We identified a hyper-inflammatory response in ERS patients, which may explain why some patients fell sick after being discharged, and suggest that the current criteria for hospital discharge should be re-evaluated. In addition, we identified unique signatures of myeloid, NK and T, and B cells and pinpointed the changes in the epitopes of TCR and BCR. Our findings helped elucidate the antiviral immune mechanisms and revealed promising opportunities for developing immunotherapies using vaccines and neutralizing antibodies.
Inflammation is a vital part of the immune system's response to COVID-19 invasion; previous and latest studies have reported significantly higher levels of inflammatory cytokines associated with disease severity in SARS, MERS, and COVID-19 patients 22,23 . Among the various inflammatory cells, monocytes and their subsets (including classical, intermediate, and non-classical monocytes) may play a critical role because they are known to fuel inflammation [24][25][26][27] . In our study, compared with the HCs, ERS patients demonstrated a significantly higher ratio of monocytes, and these cells expressed higher levels of inflammatory genes. Intriguingly, the ratio of classical CD14 + monocytes was high in ERS but remained normal in LRS. Furthermore, CD14 + IL1β monocytes, which were absent in HCs, could be observed in ERS, and they declined in number in LRS. Notably, our cell-to-cell interaction analysis indicated that IL1β, CSF1, IL6, and CSF2 may be associated with cytokine storm. The CD14 ++ IL1β subpopulation appeared to be part of the inflammatory landscape of COVID-19, since these cells increased in ERS stage. Virus-induced IL-1β production in monocytes is mediated via a caspase-1 pathway. Multiple microbial components, including viral RNA, are thought to trigger assembly of the inflammasome and consequent caspase-1 activation 28 , which may give a reasonable explanation for the presence of IL1β + monocytes in ERS patients. Previous studies with a pathogenic influenza A virus revealed impaired neutrophil and CD4 + T cell activation in IL-1R1 −/− mice, greatly diminished lung inflammatory infiltrates, reduced IgM levels in both serum and at mucosal sites and decreased activation of CD4 + T "helpers" in secondary lymphoid tissue 29 , indicating IL1β is responsible for virus-induced lung immunopathology establishment. CD14 + IL1β is also expected to become an important detection marker for monitoring COVID-19 disease recovery. Collectively, our data provide important (see figure on previous page) Fig. 4 Characterization of T and NK cell responses in the blood of recovered COVID-19 patients. a Ten sub-clusters of NK and T lymphocytes were identified. The UMAP plot shows the clustering of T and NK cells. CD56 + CD16 -NK cells (NK1), C56 -CD16 + NK cells (NK2), naïve CD4 + T cells (T1), central memory CD4 + T cells (T2), effector memory CD4 + T cells (T3), regulatory T cells (T4), naïve CD8 + T cells (T5), effector memory CD8 + T cells (T6), cytotoxic CD8 + T cells (T7), and proliferating T cells (T8). b UAMP plot showing subtype-specific marker genes of NK and T cells including CD4, CD8A, NCAM1, CCR7, GZMK, GNLY, MKI67, FCGR3A, and IL-1β. c The bar plot shows the percentages of four sub-clusters of NK and T cells, four sub-clusters of CD4 + T cells, and three sub-clusters of CD4 + T cells among the HCs and the ERS and LRS patients. d Heatmap of CD4 + T cells showing the DEGs between the COVID-19 patients and HCs. e The volcano plot shows the DEGs of CD4 + T cells between the COVID-19 patients and HCs. P values were calculated using a paired, two-sided Wilcoxon test and FDR corrected using the Benjamini-Hochberg procedure. f GO BP enrichment analysis of the DEGs of CD4 + T cells upregulated in the COVID-19 patients. P value was derived by a hypergeometric test. g The pie plot shows the TCR clone differences across the HCs and the ERS and LRS patients. h UAMP shows expanded TCR clones (n ≥ 2) in the ERS and LRS patients. i The volcano plot shows the DEGs of CD8 + CTLs between the COVID-19 ERS group and HCs. P values were calculated using a paired, two-sided Wilcoxon test and FDR corrected using the Benjamini-Hochberg procedure.  insights into the role of monocytes in the immunopathogenesis of COVID-19. The adaptive immune system harbors the ability to recognize and remember specific pathogens through antibody and T cell responses 30 . Inducing adaptive immunity is the aim of vaccination 31 . Previous SARS studies have identified binding and neutralizing antibodies elicited by SARS-CoV infection. Their therapeutic effect is unclear 32 , although robust antibody responses could be induced 33 . In COVID-19 infection, although several lines of evidence have consistently indicated a decline in lymphocyte counts, the distinct immune characteristics at single-cell resolution are unclear. Irani thevarajan et al. 34 reported that in the blood of mild-to-moderate COVID-19 hospitalized patients, the antibody-secreting cells, follicular T-helper cells, activated CD4 + and CD8 + T-cells and IgM/IgG SARS-CoV-2-binding antibodies were increased by using flow cytometry, and they found the changes persisted for at least 7 days following full resolution of symptoms. Our scRNA-seq analysis showed that, compared with the HCs, ERS patients who recovered less than 7 days have a lower ratio of T and NK cells, and these patients' T cells express higher levels of inflammatory genes, such as JUN, FOS, JUNB, and KLF6. Several studies have reported that lymphopenia is a prominent part of SARS-CoV2 infection and lymphocyte counts are useful in predicting the severity and clinical outcomes [35][36][37] . We found the number of NK and T cell decreased but no significant change in B cell in COVID-19 patients especially in ERS patients. Possible reasons for it may be the direct infection of lymphocytes by SARS-CoV2, cytokinemediated lymphocyte trafficking in the infected tissue or lymphocytes exhaustion in the peripheral blood and sequestration in the lung induced by cytokine storm 35 . There may also be immune-mediated lymphocyte destruction, bone marrow or thymus suppression, or apoptosis 38 , as is reported in other virus infection, which requires further study. In addition, high-throughput TCR sequencing identified expanded T cell clones in ERS patients. In LRS patients, the immunophenotype was different. In particular, LRS patients would have an increase in T and NK cells, with a lower expression of inflammatory genes. We also performed a detailed analysis of B cells in patients and identified a higher population of plasma cells than that in the HCs. We found that BCR contained highly expanded clones, indicating their SARS-CoV-2 specificity. Importantly, we found several loci unique to COVID-19 infection. The strongest pairing frequencies, IGHV3-23-IGHJ4, indicated a monoclonal state associated with SARS-CoV-2 specificity, which has not been reported yet. Notably, numerous studies have reported biased usage of VDJ genes related to virus-specific antibodies. For example, IGHV3-30 and IGKV3-11 have been involved in encoding primary antibodies to neutralize human cytomegalovirus 39,40 . In addition, IGHV3-30 and IGHV3-21 have been utilized to isolate influenza virus antibodies and used for the production of virus vaccines 41,42 . Moreover, a recent study demonstrated that antibodies combining the IGHV3-15/ IGLV1-40 segments had superior neutralizing activities against the Zaire Ebola virus 43 . In addition, we observed lower expression of inflammatory genes in ERS patients than in the HCs. We envision that our results will provide direction for the development of vaccines and antibodies for COVID-19 patients.
Interaction between immune cells may help expedite or defer recovery from COVID-19 infection. Our cell-to-cell prediction analysis utilizing scRNA-seq data indicated that, in ERS patients, B cell-derived IL-6, T cell-derived CSF1 (M-CSF), and CSF2 (GM-CSF) may promote monocyte proliferation and activation. As a result, monocytes may produce a larger number of inflammatory mediators, including IL-1β and IL-6, contributing to inflammatory storm. In LRS patients, both DCs-derived TNFSF13 and IL-18 and T cell-derived IL-2, IL-4 may promote B cell survival, proliferation, and differentiation. Consequently, B cells produce numerous SARS-COV-2specific antibodies to clear viruses, which is in agreement with the research of Zhou Y, et al. 44 who reported CD4 + T cells are activated into T-helper (Th) 1 cells and generate GM-CSF etc. to induced inflammatory CD14 + CD16 + monocytes with high expression of IL-6 and accelerates the inflammation after 2019-nCoV infection.
The immune system comprises a network of cells, tissues and organs that mediate host defense against pathogens. Immune cells can be classified into distinct types based on specific surface markers with the aid of flow cytometry and microscopy. However, not all immune cell types can be completely addressed by a separate analysis of phenotypic markers, as many markers are expressed by multiple cell lineages or are regulated differently during inflammation. In recent years, sequencing technology has been widely used in biological research. On this base, scRNA-seq is used in immunological research to seek to address previously unrecognized cellular heterogeneity, and to reveal key pathways in gene regulatory networks that predict immune function 45 . In the present study, we applied single-cell technology to comprehensively characterize transcriptional changes in peripheral blood mononuclear cells during the recovery stage of COVID-19. scRNA-seq is a powerful tool to identify novel cell subsets during disease progression. In our study, CD14 + IL1β subpopulation was mapped using this method, which uncovered the originator cells in ERS patients. In conclusion, our study provided the first immune atlas of patients who have recovered from COVID-19 and identified adaptive immune dysregulation after discharge. The clonal expansion of both T and B cells indicated that the immune system has gradually recovered; however, the sustained hyper-inflammatory response for more than 7 days after discharge suggested the need for medical observation after patients are discharged from hospital. Longitudinal studies of recovered patients in a larger cohort might help to understand the consequences of the disease. The novel BCRs identified in our study may advance our understanding of B cell mechanisms and have potential clinical utility in COVID-19 immunotherapies.

Patients
Ten COVID-19 patients diagnosed with by real-time fluorescent RT-PCR were collected in the Wuhan Hankou Hospital China. Patients were divided into early recovery stage (ERS) group and late recovery stage (LRS) group according to the days from first negative nucleic acid transfer date to blood sampling date. We defined the ERS group of five cases as the date of nucleic acid turning negative to blood sampling is less than seven days and LRS group of five cases as is more than fourteen days. The 10 patients consisted of five males and five females and ranged from ages 30-80 years old, with a median of 58 years old in ERS, a median of 49 years old in LRS and a median of 55 years old in heathy controls (HCs). No significant differences were detected between HCs, ERS group and LRS group. The demographic characteristics of these patients and HCs are provided in Supplementary Fig. S1a-c. A written informed consent was regularly obtained from all patients. The study was approved by the Ethics Committee of Wuhan Hankou Hospital, China.

Quantitative reverse transcription polymerase chain reaction
The throat swab, sputum from the upper respiratory tract and blood were collected from patients at various time-points after hospitalization. Sample collection, processing, and laboratory testing complied with WHO guidance. Viral RNA was extracted from samples using the QIAamp RNA Viral Kit (Qiagen, Heiden, Germany) according to the manufacturer's instructions. SARS-CoV-2-infected patients were confirmed by use of qRT-PCR kit (TaKaRa, Dalian, China) as recommended by China CDC.

Single-cell collection and scRNA-seq
The peripheral blood mononuclear cell (PBMCs) were isolated from heparinized venous blood of patients or healthy donors using a Ficoll-Hypaque density solution according to standard density gradient centrifugation methods. For each sample, the cell viability exceeded 80%.
The single-cell suspensions of scRNA-seq samples were converted to barcoded scRNA-seq libraries using the Chromium Single Cell 5′ Library, Gel Bead and Multiplex Kit, and Chip Kit (10x Genomics). The Chromium Single Cell 5′ v2 Reagent (10x Genomics, 120237) kit was used to prepare single-cell RNA libraries according to the manufacturer's instructions. The FastQC software was used for quality check. The Cell Ranger software (version 3.1.0) was used for initial processing of the sequencing data.

ScRNA-seq data alignment and sample aggregating
We de-multiple and barcode the sample by using The Cell Ranger Software Suite (Version 3.1.0) (https:// support.10xgenomics.com) and with command cell ranger count. After getting each sample gene counts, and aggregate them together. Finally, gene-barcode matrix of all ten patients and five HCs was integrated with Seurat v3 46 (https://satijalab.org/) and monocle3 47

Dimensionality reduction and clustering analysis
We handle the data with Log normalize before cluster and reduction, scale data with top 5000 most variable genes by using FindVariableFeatures function in R package Seurat v3. Clustering and dimensionality method mainly used in monocle3 package. For quality control, the genes used in PCA analysis have eliminated mitochondria (MT), and ribosomes (RPL and RPS) genes including MT-ND3, MT-ATP8, RPS15A, RPS28, RPS21, RPS27, RPS29, RPL36, RPL34, RPL37, RPL38, RPL39, RPL26 and et al. with 50 principal components, and then aligned together, followed by UMAP and t-SNE are both used after the results of the aligned. We used the default parameters with a shared nearest neighbor parameter optimized for each combined dataset inside Monocle3.

Differential analysis for clusters
Seurat package FindAllMarkers in Seurat v3 was used to perform differential analysis between the control and disease groups of the same cell type, the function parameters we used in Seurat v3 are default. For each cluster, differentially expressed genes (DEGs) were generated relative to all of the other cells.

Gene functional annotation
Gene ontology, gene-set enrichment analysis and KEGG pathway analyses from DEGs were performed using Metascape webtool 48 (www.metascape.org), which supports statistical analysis and visualization of functional profiles for genes and gene clusters.

TCR and BCR V(D)J sequencing and analysis
Full-length TCR/BCR V(D)J segments were enriched from amplified cDNA from 5′ libraries via PCR amplification using a Chromium Single-Cell V(D)J Enrichment kit according to the manufacturer's protocol (10x Genomics). The TCR/BCR sequences for each single T/B cell were assembled by Cell Ranger vdj pipeline (v3.1.0), leading to the identification of CDR3 sequence and the rearranged TCR/BCR gene. Analysis was performed using Loupe V(D)J Browser v.2.0.1 (10x Genomics) (https:// support.10xgenomics.com). In brief, a TCR/BCR diversity metric, containing clonotype frequency and barcode information, was obtained. Using barcode information, T/B cells with prevalent TCR/BCR clonotypes were projected on a t-SNE plot.

Cell-cell interaction analysis
The cell-cell interaction analysis was based on the expression of immune-related receptors and ligands.
The potential ligand-receptor interaction between one set of ligand-expressing cells and another set of receptorexpressing cells was calculated as the average of the product of ligand and receptor expression (respectively, from set one and two) across all single-cell pairs: where I is the interaction score between ligand-expressing cells in set one and receptor-expressing cells in set two, Ii is the ligand expression of cell i in cell set one, rj is the receptor expression of cell j in cell set two, n is the number of cells in set one and m is the number of cells in set two 49 .
The gene list contained 168 pairs of well-annotated receptors and ligands, including cytokines, chemokines and co-stimulators. We estimated the potential interaction between two cell types mediated by a specific ligandreceptor pair by the product of the average expression levels of the ligand in one cell type and the corresponding receptor in the other cell type.

Data availability
The accession numbers for the sequencing raw data and processed data in this paper are GSA (Genome Sequence Archive in BIG Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences): HRA000069 and EGA: EGAS00001003449, respectively.