Single-cell transcriptomic analysis reveals disparate effector differentiation pathways in human Treg compartment

Human FOXP3+ regulatory T (Treg) cells are central to immune tolerance. However, their heterogeneity and differentiation remain incompletely understood. Here we use single-cell RNA and T cell receptor sequencing to resolve Treg cells from healthy individuals and patients with or without acute graft-versus-host disease (aGVHD) who undergo stem cell transplantation. These analyses, combined with functional assays, separate Treg cells into naïve, activated, and effector stages, and resolve the HLA-DRhi, LIMS1hi, highly suppressive FOXP3hi, and highly proliferative MKI67hi effector subsets. Trajectory analysis assembles Treg subsets into two differentiation paths (I/II) with distinctive phenotypic and functional programs, ending with the FOXP3hi and MKI67hi subsets, respectively. Transcription factors FOXP3 and SUB1 contribute to some Path I and Path II phenotypes, respectively. These FOXP3hi and MKI67hi subsets and two differentiation pathways are conserved in transplanted patients, despite having functional and migratory impairments under aGVHD. These findings expand the understanding of Treg cell heterogeneity and differentiation and provide a single-cell atlas for the dissection of Treg complexity in health and disease.

R egulatory T (T reg ) cells are a highly immunosuppressive population of CD4 + T cells characterized by the expression of the transcription factor forkhead box protein P3 (FOXP3). T reg cells control immune responses and maintain peripheral tolerance [1][2][3][4] . T reg cells have been suggested to be heterogeneous. Based on cell origins, T reg cells are divided into two parts. Thymus-derived T reg (tT reg ) cells develop in the thymus and make up the majority of T reg cell pool in secondary lymphoid organs. Peripheral T reg (pT reg ) cells arise from conventional T (T con ) cells at peripheral inflammation sites with the acquisition of FOXP3 expression 5 . Human peripheral blood (PB) CD4 + FOXP3 + T reg cells are subdivided into three subpopulations: CD45RA + FOXP3 lo /CD25 lo resting or naïve T reg cells, CD45RA − FOXP3 hi /CD25 hi effector T reg cells, and CD45RA − FOXP3 lo /CD25 lo cells (not true T reg cells) 6 . Mass cytometry analysis based on 26 well-recognized T reg -associated markers identified 22 subsets 7 . T reg cells can also be divided into several T helper (Th) like subpopulations 8 . However, none of these taxonomies are dependent on the unsupervised global gene expression profile.
Recent advances in single-cell RNA sequencing (scRNA-seq) have shed new light on T cell and T reg cell heterogeneity at singlecell resolution [9][10][11][12][13] . Based on scRNA-seq, T reg cells were subdivided into six clusters in healthy human PB 12 , or five clusters in the human breast cancer microenvironment 10 . The proportions of resting and activated T reg populations in mice were reportedly not determined by T cell receptor (TCR) signaling strength, whereas the intensity of TCR signal intriguingly influenced the phenotypic and functional programs of activated T reg cells 12 . T reg cells also exhibited transcriptional dynamics along a continuum of tissue adaptation and presented conserved expression programs between homeostasis and disease and between mice and human 13 . Despite these findings, in-depth single-cell investigations on human T reg cells during steady state or disease conditions are still limited. The identity, functional/homeostatic characteristics, differentiation, and relationships of distinct T reg subsets remain incompletely understood.
T reg cells are impaired in number or function in inflammatory disorders. Typically, graft-versus-host disease (GVHD) is a major adverse effect of allogeneic hematopoietic stem cell transplantation (allo-HSCT) 14 . GVHD is associated with the decreased number and function of T reg cells 15,16 . Transfer of T reg cells has alleviated GVHD symptoms in mouse models and clinical trials [17][18][19] . However, the mechanisms underlying T reg cell defects in GVHD have not been fully addressed, at least partially due to the lack of single-cell omics analysis.
In this work, we use scRNA-seq and single-cell TCR (scTCR)seq to analyze T reg cells in PB and bone marrow (BM) from healthy donors and allo-HSCT patients with or without acute GVHD (aGVHD). Heterogeneous T reg cell subpopulations are resolved. Their transcriptional signatures, phenotypic markers, and functional and homeostasis programs are defined. In addition, two T reg cell differentiation pathways are identified, and their characteristics and transcription factors are defined. Moreover, T reg cells from allo-HSCT patients with or without aGVHD are also analyzed at the single-cell level, which provide further insight into the conservation and change of T reg cell dynamics under disease conditions.
Results scRNA-seq resolves distinctive subsets among human FOXP3 + T reg cells. We firstly conducted scRNA-seq on T reg cells from healthy donors and allo-HSCT patients with or without aGVHD. scRNA-seq was performed using the 10× Genomics Chromium platform to analyze CD4 + CD25 + CD127 − T reg 20,21 and CD4 + CD25 − conventional T (T con ) cells sorted from PB and BM of healthy donors and allo-HSCT patients with or without aGVHD (Fig. 1a, Supplementary Fig. 1a-c, Supplementary Data 1). Consistent with a previous study 12 , not all CD4 + CD25 + CD127 − cells had FOXP3 reads due to potential non-T reg cell contamination and limited gene coverage in scRNA-seq ( Supplementary Fig. 1d). To reduce the contamination of non-T reg cell populations, FOXP3 + cells were selected for subsequent analyses. The 43,178 FOXP3 + T reg cells and 3,138 CD4 + FOXP3 − T con cells were analyzed and they had an average of 1331 genes per cell ( Supplementary Fig. 1e, f, Supplementary Data 2). The T reg identity of these FOXP3 + cells was confirmed by their enriched expression of canonical T reg marker genes including FOXP3, IKZF2, TIGIT, IL2RA, IL10RA and CTLA4 ( Supplementary Fig. 1g, Supplementary Data 3) 3,4 .
The heterogeneity of healthy donor T reg cells was assessed. Cell clusters were identified based on the shared nearest neighbor (SNN) clustering algorithm in Seurat 22 and visualized through tdistributed stochastic neighbor embedding (t-SNE) or Uniform Manifold Approximation and Projection (UMAP) analysis. Finally, nine clusters in PB and BM T reg cells from healthy donors were defined (Fig. 1b, Supplementary Fig. 2a). According to previously defined markers of naïve and effector T reg cells, including CCR7, HLA-DR, and FOXP3 [23][24][25] , and the top 10 signature genes and signature transcription factors of each cluster (Fig. 1c, d, Supplementary Fig. 2b-f, Supplementary Data 4), clusters 0, 3, and 5 in PB (P0, P3, P5) and clusters 0, 1, and 4 in BM (B0, B1, B4) perfectly matched naïve status with CCR7 hi TCF7 hi HLA-DR low FOXP3 low profile. These clusters were termed naïve subsets. The others were activated/effector subsets. They featured negative or low expression of CCR7. In addition, the partition-based graph abstraction (PAGA) mapping revealed high connectivity among the naïve clusters, and among the activated/effector clusters (Fig. 1e). To better resolve these activated/ effector subsets, we performed a further annotation. Among them, clusters 2 in PB and BM (P2, B2) exhibited high expression of HLA-DR and were named as HLA-DR hi subsets. The expression of LIMS1 was high in clusters 4 and 8 in PB (P4, P8) and clusters 5 and 8 in BM (B5, B8). Among these clusters, P4 and B5 were designated LIMS1 hi subsets. Cluster 7 in PB (P7) and cluster 6 in BM (B6) were highly similar in transcriptome signature and had the FOXP3 hi PTPRC hi DDX17 hi profile. P7 and B6 were designated FOXP3 hi effector subsets. Cluster 8 in PB (P8) and cluster 8 in BM (B8) were very similar, both of them had an MKI67 hi TUBB hi HMGB2 hi profile and the maximal proportion of cells in the S + G2/M phase ( Supplementary Fig. 3a). P8 and B8 were designated MKI67 hi effector subsets. Although the FOXP3 hi and MKI67 hi T reg cell subsets were highly conserved across PB and BM, they also showed tissue-specific gene expression differences (Fig. 1f, Supplementary Fig. 3b, Supplementary Data 5). These differences might have been caused by the distinctive tissue microenvironment. T reg subsets also displayed disparate expression of signature transcription factors (Supplementary Fig. 2f). There were more naïve T reg cells in BM than those in PB, although there was no significant difference ( Supplementary Fig. 3c). The naïve and effector T reg cell markers suggested by T reg cell scRNA-seq studies 12,13 were also enriched in our defined naïve and activated/effector subsets, respectively ( Supplementary Fig. 3d, e). Thus, based on unsupervised clustering, we identified several distinct human T reg cell subsets under steady-state conditions.
To compare the functional/homeostatic features among different clusters, gene set variation analysis (GSVA) 26,27 was performed (Fig. 1g, Supplementary Data 6). GSVA showed a gradual increase in TCR signaling, activation, and suppressive function from naïve to activated/effector subsets. The MKI67 hi subsets had the highest expression of TCR signal, suppression, apoptosis, cell cycle (S, G2/M), glycolysis, and tricarboxylic acid cycle (TCA) gene sets. The FOXP3 hi subsets were characterized by low TCR signal and glycolysis, intermediate suppression, and high fatty acid oxidation (FAO) gene sets. The HLA-DR hi subsets exhibited the highest degree of activation. Further examination of individual genes indicated that suppression and migration genes were selectively expressed by different subsets ( Supplementary  Fig. 3f). Particularly, the HLA-DR hi subset strongly expressed TGFB1, CCR3 and CCR10. The LIMS1 hi subset highly expressed IL10 and CCR9. The FOXP3 hi subset highly expressed IL2RA, CTLA4, CCR4, CXCR4 and SELL. The MKI67 hi subset highly expressed LGALS1 and ITGAE.
scTCR-seq reveals amplification and transition among T reg cells subsets. TCRs are uniquely expressed by individual T reg cell clones 28,29 . To trace the amplification history of a single T reg clone, scRNA-seq and paired scTCR-seq were performed simultaneously in 11,830 sorted FOXP3 + T reg cells and 1,519 sorted T con cells (Supplementary Data 7). This approach allowed direct mapping of gene expression to TCR in the same cell. Most amplified TCR clonotypes (n ≥ 2) were detected in activated/ effector subsets such as HLA-DR hi and LIMS1 hi effector subsets. No or very few cells from naïve, FOXP3 hi , or MKI67 hi subsets harbored clonally amplified TCRs (Fig. 1h). We also identified a total of 250 and 27 TCR clonotypes that were shared by cells from at least two different clusters in PB and BM, respectively (Supplementary Fig. 3g). Most TCRs were shared between activated/ effector subsets, suggesting frequent differentiation between them. Although either FOXP3 hi subset or MKI67 hi subset had shared TCR clonotypes with other subsets, there was no shared clonotype between FOXP3 hi subset and MKI67 hi subset, which might be due to their distinctive development or the limited cell number. The latter would not be sufficient to capture the same TCR (many TCR clonotypes are present in vivo). In addition, 121 TCR clonotypes were shared between PB and BM T reg cells (Supplementary Fig. 3h). The TCRs shared between tissues were presented in most subsets, except P5 (naïve), P3 (naïve), P8 (MKI67 hi ), and B8 (MKI67 hi ). Therefore, P5 and P3 may be at the most naïve status that were only present in PB. MKI67 hi subsets might differentiate independently in PB or BM. However, the lack of shared TCRs between PB and BM MKI67 hi subsets may have been caused by a relatively small cell number of analyzed cells. These results collectively indicated a process of naïve to activated and effector differentiation, defined as the transition between individual T reg subsets in one tissue or between two tissues. However, we could not determine the order of T reg cell development between different tissues. Further studies are necessary.
In vitro assay of T reg subsets. To evaluate the functional and homeostatic characteristics of T reg subsets in vitro, cell surface marker candidates for different clusters were identified (Fig. 2a), and clusters from PB were isolated by flow cytometry sorting (Fig. 2b, Supplementary Figs. 4a, 12a, f). Cells in the FOXP3 hi and MKI67 hi subsets were the largest and displayed the highest expression of FOXP3 and KI67 proteins, respectively (Fig. 2c, d, Supplementary Figs. 4b, 12a). The FOXP3 hi and MKI67 hi subsets exhibited the strongest inhibition of responder T (T resp ) cell proliferation, followed by LIMS1 hi and HLA-DR hi subsets and then by other activated/effector subsets (Fig. 2e, Supplementary  Fig. 12a, b). MKI67 hi and LIMS1 hi subsets proliferated more than other activated/effector subsets after stimulation (Supplementary Figs. 4c, 12a, c). Activated/effector subsets had a slightly higher apoptosis rate than most naïve subsets during culture (Supplementary Figs. 4d, 12a, c). Surprisingly, a large proportion of cultured LIMS1 hi subset cells lost the FOXP3 protein ( Supplementary Figs. 4e, 12a, c). The LIMS1 hi subset rarely expressed Helios (encoded by IKZF2) (Supplementary Figs. 4f,  12a), which is considered a potential marker of thymus-derived T reg cells 30 . Although, the LIMS1 hi subset shared a number of TCRs with activated/effector T reg subsets but not with T con cells ( Supplementary Fig. 4g), it still could not determine whether LIMS1 hi subset were converted from T con cells or not. The collective in vitro findings further defined the functional and homeostatic characteristics of T reg subsets previously resolved with single-cell transcriptomics.
Two effector differentiation paths revealed by pseudotime analysis. To delineate the hierarchy and development relationship between subsets, pseudotime analysis 31 was performed. The findings unexpectedly revealed two effector differentiation pathways (termed Path I and II) in both PB and BM (Fig. 3a). Signature genes for Pre-branch (such as TCF7, EEF1B2, C1orf162, and SNHG7), Path I (such as PTPRC, DDX17, MALAT1, and PDE3B), and Path II (such as HLA-DR, LGALS1/3 and CD74) were identified (Fig. 3b, Supplementary Data 8). Correlation analysis suggested a conservation between PB and BM of Prebranch and Path II, but not of Path I ( Supplementary Fig. 5a). Different clusters merged into a process in pseudotime that started with naïve subsets (mostly in the Pre-branch), followed by activated/effector subsets, and ending with the FOXP3 hi subset (Path I) or MKI67 hi subset (Path II) (Fig. 3a, bottom, Supplementary Fig. 5b). Even cells within the same subsets could be distributed into the two paths, indicating that this bifurcated differentiation is a dominant rule (Fig. 3a, top). There were some overlapping marker genes between the FOXP3 hi subset and Path I, or between the MKI67 hi subset and Path II, consistent with the FOXP3 hi subset as the terminus of Path I and the MKI67 hi subsets as the terminus of Path II ( Supplementary Fig. 5c). Gene Ontology analysis showed that, in both PB and BM, Pre-branchenriched genes were associated with translation and protein targeting to the endoplasmic reticulum (ER), Path I-enriched genes were related to cell-cell adhesion and cell aggregation, and Path II-enriched genes played roles in response to immune system processes (Fig. 3c, Supplementary Fig. 5d, e, Supplementary Data 9). GSVA indicated that Path II cells expressed higher TCR, activation, suppression, migration, apoptosis, S and G2/M gene sets than did Path I cells (Fig. 3d). Path I cells preferentially expressed FAO gene sets, and Path II cells predominantly expressed glycolysis and TCA gene sets (Fig. 3d).
Inspecting of individual genes revealed that the Path I terminus highly expressed the critical membrane-associated suppressor genes IL2RA and CTLA4 and soluble suppressor genes FGL2 and PRF1. The Path II terminus highly expressed the soluble suppressor genes LGALS1, TGFB1, GZMA, IL10, and IL12A and the proliferation genes TOP2A, PCNA, and HMGB2, with a slightly higher proportion of S + G2/M cells (Fig. 3e, Supplementary Fig. 6a).  1 ScRNA-seq and TCR-seq reveals distinctive subsets among human FOXP3 + T reg cells. a A scheme showing the overall strategy of this study. b t-SNE of single-cell transcriptomes of T reg cells from healthy donor (HD) peripheral blood (PB, n = 8) and bone marrow (BM, n = 6) samples, colored by subsets. Each subset was numbered and labeled with the first letter (s) of the tissue name (P0 for cluster 0 in peripheral blood, etc.). c Projection of CCR7, HLA-DRB1, FOXP3, PTPRC and MKI67 expression onto the t-SNE plot. d Heatmaps showing the top 10 (by fold change) marker genes for each subset, excluding the ribosomal and mitochondrial genes. Scaled expression means the gene expression was centered and scaled among subsets. The fold change means the values of normalized expression of genes in a specific subset compared to the normalized expression of genes in the other subsets. e Partition graph abstraction (PAGA) analysis. Nodes represent subsets, and thicker edges indicate stronger connectedness between subsets. f Correlograms visualizing the correlation of single-cell gene expression profiles between subsets from HD PB and HD BM. g Heatmaps showing the GSVA enrichment score of T reg cell feature pathways for each subset. h The amplified (n ≥ 2) TCR distribution of T reg cells across different subsets, colored by TCR clonotypes. The insert pictures show the composition of unique and amplified (n ≥ 2) T reg cell TCR clonotypes in each subset. Source data are provided as a Source Data file. ARTICLE NATURE COMMUNICATIONS | https://doi.org/10.1038/s41467-021-24213-6 TCR analyses revealed that most of the amplified TCR clonotypes were in Path II cells (Fig. 3f, Supplementary Data 7), corroborating the high glycolysis-proliferation gene signature. Shared TCRs were mostly observed between Pre-branch and Path II and between Path II and Path I, indicating a dominant differentiation from Pre-branch to Path II and the transition between Path II and Path I cells (Fig. 3g). A more detailed analysis of TCR overlap indicated that most Path I HLA-DR hi and FOXP3 hi subset of cells might be derived from the Path II HLA-DR hi subset. While Path II MKI67 hi subset did not transit into Path I cells (Fig. 3h), suggesting that the MKI67 hi subset was probably a distinctively developed subpopulation.
In vitro assay of two differentiation Paths. The surface markers of different paths were identified to validate the properties of the two paths ( Supplementary Fig. 6b, c, Supplementary Data 8). In vitro assays were performed after isolation of Path I and II cells by flow cytometry: Pre-branch as CCR7 + , Path I as CCR7 − CCR4 med/hi , Path II as CCR7 − CXCR3 + (Supplementary Figs. 6d, 12a). Path I cells expressed more CD25 and CTLA4 proteins, whereas Path II cells expressed more IL10 and TGF-β1 proteins ( Fig. 4a, b, Supplementary Fig. 12a, d). However, the expression of GZMA, GZMB, and LAP (the N-terminal dimer of latent TGF-β1) 32 proteins were overall very low and only marginally different between Path II and Path I cells (Fig. 4b, c, Supplementary  Fig. 12a, d). Path I cells had stronger suppressive but weaker proliferative capacity than Path II and Pre-branch cells in vitro (Fig. 4d, e, Supplementary Fig. 12a-c). Path II cells displayed lower expression of FOXP3 protein than Path I cells, and were slightly larger in size than Pre-branch and Path I cells (Supplementary Figs. 6e, f, 12a, c). Thus, single-cell transcriptomics in combination with functional assays defined two effector differentiation paths in T reg cells.
Transcription factors enriched in two differentiation paths. To understand the transcription factor basis of bifurcated differentiation, the signature transcription factors were identified for each path in PB T reg cells. TCF7, FOXP3, and SUB1 were the most significant signature transcription factors in Pre-branch, Path I, and Path II cells from PB, respectively (by p value, Fig. 5a, Supplementary Fig. 6g). PB-BM conservations were evident in the enrichment of TCF7 in the Pre-branch and SUB1 and HMGB2 in Path II (Fig. 5a). Single-cell correlation analysis indicated that FOXP3 was preferentially co-expressed with Path I-associated suppressor genes IL2RA, CTLA4, and FGL2 within individual cells, whereas SUB1 was frequently co-expressed with Path IIassociated suppressor genes EBI3, IL12A, IL10, TGFB1, GZMA, GZMB, and LGALS1 and with the proliferative genes MKI67, MCM6, TOP2A, PCNA and CYCLINs in healthy donor PB (Fig. 5b). A similar gene correlation pattern was observed in the BM ( Supplementary Fig. 6h).
To determine whether FOXP3 and SUB1 contribute to the Path I and II features, gain-of-function experiments were performed using T reg cells that overexpressed FOXP3 or SUB1. FOXP3 and SUB1 were successfully overexpressed in T con and T reg cells by infection with recombinant lentivirus (Fig. 5c,   T resp + P0 T resp + P1 T resp + P2 T resp + P4 T resp + P8 T resp + P7 T resp + P6 T resp + P3 T resp   In d, red line indicated that P8 was compared with P5, P3, P0, P1, P6, P2, P4 or P7. e In vitro suppression assay. Tag-it Violet-labeled T resp cells were coculture with different T reg subsets for 96 h, with CD3/CD28 T cell activator (n = 3). The T reg : T resp ratio was 1: 2. The Tag-it Violet dilution of T resp cells was assessed by flow cytometry. Blue line indicated that P7 was compared with P5 or P3; orange line indicated that P7 was compared with P0 or P6; red line indicated that P8 was compared with P5 or P3; green line indicated that P8 was compared with P0 or P6. In c-e, experiment repeated at least three times; p values were determined by One-way ANOVA, part of the statistical significance was shown in the pictures and the whole ANOVA results were given in Supplementary Data 11; data are presented as mean values ± SEM. Source data are provided as a Source Data file.   Supplementary Figs. 7a-d, 12d, e). FOXP3 overexpression increased IL2RA, CTLA4, FGL2, PRF1, and IL12A mRNA levels. SUB1 overexpression increased IL10, HMGB2, and MKI67 mRNA levels in T reg cells (Fig. 5e, Supplementary Fig. 12e). The expression of some genes was confirmed at the protein level. FOXP3 overexpression slightly increased the level of CD25, CTLA4, and FGL2 proteins in T reg cells. SUB1 overexpression increased the level of IL10 protein and slightly increased KI67 protein T reg cells ( Fig. 5f-j, Supplementary Fig. 12d, e). The levels of PRF1 and IL12 p35 were overall low in T reg cells and not significantly affected by FOXP3 overexpression (Supplementary Figs. 7e, f, 12d, e). FOXP3 overexpression also significantly promoted the RNA and protein levels of CD25 and CTLA4 in T con cells, consistent with a previous report 33 ( Supplementary  Figs. 7g, h, 12d, e). Some other genes that changed after FOXP3/ SUB1 overexpression in T con cells were not significantly altered in T reg cells ( Supplementary Figs. 7g, 12e), suggesting that these transcription factors may have cell type-dependent functions. These results collectively indicated the possible contribution of FOXP3 and SUB1 to the features of the two paths. However, the changes in some genes were slight after FOXP3 and SUB1 overexpression, suggesting that other transcription factors may also be involved, which warrant further study.
T reg subsets in allo-HSCT patients with or without aGVHD. T reg cells are largely regenerated after allo-HSCT, but their function and number are impaired if complicated with aGVHD ( Supplementary Fig. 8a) 15,[34][35][36] . To understand the disturbance of T reg subsets and pathways under this pathological condition, scRNA-seq was used to explore FOXP3 + T cells from the PB and BM of allo-HSCT patients with or without aGVHD (Supplementary Data 1). Chimerism analyses showed early reconstitution of donor T cells (> 96%, Supplementary Data 1). T reg subsets were resolved and compared with healthy donor T reg atlas, which revealed that naïve subsets (non-aGVHD P2, non-aGVHD B3), FOXP3 hi subsets (non-aGVHD P4, aGVHD P6, non-aGVHD B6, aGVHD B4) and MKI67 hi subsets (non-aGVHD P5, aGVHD P7, non-aGVHD B5, aGVHD B5) were present in allo-HSCT patients regardless of primary diseases (Fig. 6a-c, Supplementary  Fig. 8b-d, Supplementary Data 4). Other effector clusters in allo-HSCT patients were not highly correlated with any effector clusters in healthy donors, indicating that dramatic changes occurred in effector subpopulations after allo-HSCT. Compared with healthy donors, allo-HSCT patients without aGVHD displayed reduced naïve T reg cells, whereas aGVHD patients almost completely lost naïve T reg cells (Fig. 6a-c, Supplementary Fig. 8e).
In the BM, the MKI67 hi subset was expanded up to 3% in non-aGVHD patients (non-aGVHD B5), but was decreased to < 1% (aGVHD B5) in aGVHD patients (Fig. 6a). These results indicated that FOXP3 hi and MKI67 hi subsets still preserved their identity, but other effector T reg cells had undergone dramatic alterations after allo-HSCT. GSVA analysis of T reg subsets indicated that the effector T reg subsets from non-aGVHD patients had slightly higher expression of suppression-associated genes than those from healthy donors (Fig. 6d). Even though FOXP3 hi or MKI67 hi subsets seemed to be comparable between non-aGVHD and aGVHD conditions, other effector T reg cells from PB and BM of aGVHD patients displayed lower expression of suppression-, migration-, and TCR-associated gene sets, compared with the non-aGVHD patients (Fig. 6d). In the BM of aGVHD patients, the effector T reg subsets (excluding the FOXP3 hi and MKI67 hi subsets) displayed a senescence-like defect characterized by high expression of natural killer cell receptors (NKRs) and lower expression of CD27/CD28 [37][38][39] (Fig. 6d). These defects were not associated with an exhausted phenotype (Fig. 6d). As the aGVHD patients in this study were relatively older than non-aGVHD patients ( Supplementary Fig. 8f), the senescence-like defect might be attributed to the age of aGVHD patients.   Examination of individual genes related to suppression and migration revealed that effector T reg cells (excluding the FOXP3 hi and MKI67 hi subsets) from non-aGVHD patients had higher levels of IL2RA, ENTPD1, EBI3, LGALS1, and CCR1/3/5 than with those from healthy donors (Supplementary Fig. 9). FOXP3 hi or MKI67 hi subsets displayed few genes that were differentially expressed between non-aGVHD and aGVHD conditions (Fig. 6e). Effector T reg cells from aGVHD patients had lower levels of NT5E, CTLA4, CCR1, CCR7, CCR9, and CXCR6 than those from non-aGVHD patients (Fig. 6e). In addition, lower levels of other genes such as FGL2, IL2RA, PRF1, TGFB1, IL10, CCR2-6/8/10, CXCR3, and ITGAE in BM effector T reg cells were found in aGVHD patients than in non-aGVHD patients (Fig. 6e), indicating severe defects in BM T reg cells. These results suggested that the FOXP3 hi and MKI67 hi subsets were stable populations in allo-HSCT patients regardless of aGVHD, whereas the other effector subsets underwent dramatic changes in frequency, signature, and function under allo-HSCT and aGVHD conditions. T reg paths in allo-HSCT patients with or without aGVHD. T reg cells from allo-HSCT patients preserved the two differentiation paths, mostly with the FOXP3 hi subset at the Path I terminus (except non-aGVHD BM), and with the MKI67 hi subset at the Path II terminus (Fig. 7a, Supplementary Fig. 10a, Supplementary  Data 8). Allo-HSCT patients also expressed high levels of suppressor genes LGALS1 and cell cycle genes TOP2A, PCNA, and MCM6 in Path II, and expressed high levels of suppressor genes IL2RA, CLTA4, and FGL2 in Path I, similar to healthy donor T reg cells ( Supplementary Fig. 10b). GSVA demonstrated that in allo-HSCT patients, Path II still had higher TCR signaling, suppression, migration, cell cycle, apoptosis, glycolysis, and TCA than Path I, similar to the observations in healthy donors (Fig. 7b). These results suggested that the two-path differentiation modes were preserved in allo-HSCT patients regardless of aGVHD.
CXCR6, and lower levels of CCR7, in Pre-branch, Path I, and Path II. These findings might suggest a general T reg functional activation after allo-HSCT ( Supplementary Fig. 10c). GSVA analysis revealed that, compared with non-aGVHD patients, aGVHD patients displayed senescence-like defects and reduced suppression and migration capacity (Fig. 7b) in both Path I and Path II. Examination of the individual genes revealed that, compared with non-aGVHD patients, aGVHD patients expressed lower levels of CCR7 and CXCR6 in Pre-branch, lower levels of CTLA4, PRF1, GZMA, TGFB1, CCR3, CCR5, CCR9, CCR10, CXCR3, and CXCR6 in Path I, and lower levels of CTLA4, ENTPD1, GZMA, CCR1/2/3/5/7/9, CXCR3, CXCR6, SELL, and ITGAE in Path II (Fig. 7c). These results suggested that aGVHD T reg cells have displayed path-specific defects in suppression and migration.

Discussion
Multi-faceted investigations of T reg heterogeneity have revealed diverse classifications, which largely improved the knowledge of T reg cells and potential clinical benefits. The present single-cell transcriptome-based exploration of human T reg cells from healthy donors and stem cell transplanted patients corroborates previous findings and brings some fresh insights into T reg cell biology. We resolved healthy donor PB and BM T reg cells into nine subsets continuously spanning naïve and activated/effector stages. Among them, FOXP3 hi and MKI67 hi subsets were highlighted. The FOXP3 hi subset had the highest expression of FOXP3 and IL2RA. The MKI67 hi subset was characterized by high proliferationassociated genes. FOXP3 hi and MKI67 hi subsets had the strongest suppressive capacity, suggesting that they might be important in vivo despite their limited numbers. Although the FOXP3 hi subset did not exhibit a remarkably higher score than other effector subsets in GSVA analysis based on all suppressor genes, it expressed the highest levels of IL2RA and CTLA4 (in PB), which are two critical mediators of T reg suppressor function 3 and might account for its superior suppressive capacity. Compared with the highly conserved FOXP3 hi and MKI67 hi subsets, the overall signature of the HLA-DR hi or LIMS1 hi subsets were less conserved between PB and BM. The HLA-DR hi subset displayed the highest degree of activation, which has been described previously 40 . The LIMS1 hi subset lost FOXP3 expression when cultured in vitro, but this result could be more or less affected by the presence of FOXP3-negitive cells in this sorted T reg cell subset ( Supplementary  Fig. 4a). Unexpectedly, in our study, LIMS1 hi cells still retained strong suppression ability (Fig. 2e), probably due to FOXP3independent mechanisms or because the residual FOXP3 + cells can still exert a strong suppressive effect. The LIMS1 hi subset somehow resembled the inducible murine T reg cells mentioned in a previous study 41 . The findings of a recent study that used scRNA-seq to map umbilical cord blood T reg cells in an inflammatory setting and suggested that the TIGIT − subset was sensitive to IL6 and developed an "unstable" T reg identity 42 . This subset is similar with the LIMS1 hi subset in terms of low expression of TIGIT (Supplementary Fig. 11a). However, there was no evidence in the current study concerning whether the human LIMS1 hi cells were induced or unstable T reg cells. The relationship between our LIMS1 hi subset and the previously described TIGIT − subset in literature awaits future study. In a previous study using mass cytometry, 22 T reg subpopulations were identified in the  Pre  I  II  Pre  I  II   PB  BM   HD  nG  aG  HD  nG  aG  HD  nG  aG  HD  nG  aG  HD  nG  aG  HD  nG  aG Senesence-like and functional defects in aG, compared with nG  PB of human 7 . The CD45RA + CCR4 − CD31 + , CXCR3 + CD38 low ICOS low , and CXCR3 − CD38 + subpopulations in their study were similar to our naïve subsets (P0, 3, 5), P6 effector subset, and P4 effector subset, respectively ( Supplementary Fig. 11b). However, as mRNA levels may not always be consistent with protein levels, a very strict comparison is difficult. Future analysis of protein and mRNA in the same cells may clarify the relationship between the identified subpopulations in different studies.
Our results also reveal two disparate differentiation pathways in T reg cells. Path I had high expression of CD25 and CTLA4, and genes associated with FAO and was termed as the FOXP3 hi subset. In contrast, Path II had high expression of IL10 and TGF-β1, and genes associated with glycolysis and proliferation and was termed as the MKI67 hi subset. Intriguingly, even cells within the same effector cluster could be divided into two paths, suggesting that the bifurcated differentiation is a dominant rule that governs T reg cells. Single-cell TCR analysis can provide in-depth information about T reg cell differentiation. A previous single-cell analysis indicated the close relationship of mouse T reg cells with shared TCRs 12 . Presently, high-frequency TCR clonotypes were mainly detected in activated/effector T reg cells, suggesting that the acquisition of the activation/effector phenotype requires clonal expansion. Effector T reg subsets had the highest numbers of overlapping TCR clonotypes, suggesting their frequent transition. Surprisingly, the FOXP3 hi and MKI67 hi subsets had hardly any overlapping clonotypes, and the MKI67 hi subset did not contain repeated TCRs. This may reflect their distinct developmental mode or be attributed to the few cells that were collected. Overlapping TCR clonotypes were abundant between Pre-branch and Path II and between Path II and Path I but were rare between Pre-branch and Path I, suggesting a probable differentiation route from Pre-branch to Path II and the transition between Path II and Path I cells. However, we should not rule out the possibility of differentiation from Pre-branch to Path I. We also explored the transcription factors that may contribute to Path I or Path II phenotype. Notably, the Path I cells displayed high expression of FOXP3, whereas Path II cells displayed high expression of SUB1. Overexpression of FOXP3 and SUB1 slightly but repeatedly promoted the expression of some Path I-and Path II-associated suppressive genes, respectively, suggesting they may be involved in promoting Path I and Path II phenotypes. However, after FOXP3 and SUB1 overexpression, the gene expression changes were mostly slight, suggesting that the Path I and Path II phenotypes may also be regulated by many other transcription factors that are enriched in Path I and Path II cells. This needs to be further investigated.
T reg cell differentiation has been extensively explored. A recent report indicated that human T reg cells may simulate Th cell differentiation mechanisms and can be divided into Th1, Th17, Th1/ 17, Th2, and Th22-like subpopulations 8 . Our data suggested that the FOXP3 hi subset expressed some genes associated with T reg 2like cells, whereas the MKI67 hi subset expressed some genes associated with T reg 1-like cells (Supplementary Fig. 11c). However, other subsets had more complex features and did not fit well with obvious Th-like gene expression patterns. In addition, the two paths of differentiation revealed by our trajectory analysis did not show a tendency to express a specific Th-like signature ( Supplementary Fig. 11d). For example, TBX21 and GATA3 were expressed at higher levels in Path I, while IFNG was higher in Path II. Additional studies are needed to explore this important issue. A previous study has suggested that T reg cells from nonlymphoid organs differ from those in lymphoid organs and exhibited tissue-adapted gene signatures 13 . The present findings identified differences in gene patterns between PB and BM T reg cells. Future studies on other tissues will further resolve the tissue-specific differentiation of T reg cells.
T reg cell heterogeneity and differentiation were also studied in the context of allo-HSCT. FOXP3 hi and MKI67 hi T reg subsets and the two differentiation pathways were still present in allo-HSCT patients regardless of aGVHD. However, other effector subsets from aGVHD patients displayed a senescence-like phenotype, and a decreased expression of suppression and migration-related molecules. Our results are consistent with previous microarray analysis data suggesting reduced expression of migratory and suppressive genes, including CCR1, CCR3, CCR5, CXCR3, CXCR6, LGALS1, and GZMA, in aGVHD patients 43 . With the caveat of the examination of a limited number of specimens, the average age of aGVHD patients was higher than that of non-aGVHD patients, although the average age of donors was comparable. A previous study described that the combination of donor and recipient age is a predictor of GVHD 44 . The senescence-like defect of T reg cells from aGVHD patients might be at least partially attributed to the older age of aGVHD patients compared with non-aGVHD patients.
In conclusion, with an unbiased single-cell approach, our study reveals previously unrecognized effector subsets and two-path differentiation structure in the human T reg compartment. These aspects are conserved between PB and BM and between steady state conditions and immune disturbance. The features related to function, migration, metabolism, activation, proliferation, TCR signaling, and transcription factors in different subsets and pathways are illustrated. We also suggest the transcription factor FOXP3 and SUB1 might be involved in promoting some features of the two-paths, although future study on the complete list of differentially expressed transcription factors will provide more complete understanding of two-path differentiation. These findings enrich the knowledge of human T reg cell heterogeneity, function, and differentiation, and provide a single-cell resolution atlas that will inform the greater understanding of T reg cells and T reg cell-related diseases and therapeutic interventions in these diseases.

Methods
Human specimens. This study was approved by the Ethics Committee of the State Key Laboratory of Experimental Hematology, Institute of Hematology and Hospital of Blood Disease, Chinese Academy of Medical Sciences & Peking Union Medical College, Tianjin, China (approval number: KT2017069-EC-2). All the people in this study provided written informed consent for sample collection and data analyses. The inclusion criteria of patients were allogeneic stem cell transplantation, donor cell chimerism of CD3 + cells > 95%, and no active infections (i.e., cytomegalovirus or hepatitis B virus). Nine HDs, six non-aGVHD patients and six aGVHD patients were enrolled in this study. aGVHD was staged according to modified Glucksberg criteria 45 . Their ages ranged from 12 to 58, with a median age of 32. All HSCT patients received aGVHD prophylaxis with tacrolimus or ciclosporin plus short-term methylprednisolone, with or without ruxolitinib. PB samples with/without paired BM samples were obtained for the subsequent lymphocyte isolation. Patient characteristics are given in Supplementary Data 1.
Specimen preparation of single-cell suspensions. PB and BM were collected, and coagulation was prevented by the addition of 50 U/ml heparin (Sigma-Aldrich, MO, USA). Mononuclear cells from PB or BM were isolated using Lymphoprep (STEMCELL, Vancouver, Canada) according to the manufacturer's instructions. Lysis of red blood cells was performed with 500 μl of ACK (Lonza, NJ, USA) for 5 min on ice. These cells were resuspended in sorting buffer PBS supplemented with 1% fetal bovine serum (FBS, Gibco, CA, USA). Suspensions were passed through a 70 μm cell strainer before immunostaining. Integrated analysis of different conditions 10× Genomics-derived data. To account for batch effect among different samples in each condition (HD-BM, HD-PB, Non-aGVHD-BM, Non-aGVHD-PB, aGVHD-BM, aGVHD-PB), we used "FindIntegrationAnchors" in the Seurat package to remove batch effect and merge samples in each condition to one object. In detail, the top 2 000 genes with the highest expression and dispersion from each sample were used to find the integration anchors, and then the computed anchoret was applied to perform dataset integration.
Identification and analysis of differentially expressed genes. To identify unique differentially expressed genes (DEGs) among each cluster, the "FindAll-Markers" function from Seurat was used and non-parametric Wilcoxon rank sum tests were set to evaluate the significance of each individual DEG. The DEGs with adjusted P value less than 0.05 were thought to be significant and used in downstream analysis. Then, significant genes were selected as input to perform gene ontology analysis through DAVID (https://david-d.ncifcrf.gov, version 6.8) 47 .
Clustering, heatmaps and dot plots for gene expression in single cells. Hierarchical clustering and heatmap generation were performed for single cells on the basis of normalized expression values of marker genes curated from the literature or identified significant DEGs. To visualize the expression of individual genes, cells were grouped by their cell type as determined by analysis with Seurat. Normalized gene expression values were plotted for each cell type as a heatmap or dot plot in R.
PAGA analysis. To assess the global connectivity topology between the T reg cell clusters we applied Partition-based graph abstraction (PAGA) 48 . The weighted edges represent a statistical measure of connectivity between the partitions. Connections with a weight less than 0.3 were removed.
Functional enrichment analysis of signature genes. Gene set variation analysis (GSVA) analysis was performed to identify the pathway alterations that underlie our T reg cell subsets with the Bioconductor package GSVA (version 1.16.0). The expression matrix of T reg cells were subjected to the GSVA algorithm to calculate GSVA enrichment scores for each gene set, and the gene sets are listed in Supplementary Data 6.
Pseudotime trajectories analysis. Pseudotime trajectories were constructed with the R package Monocle (version 2.12.0) 49 , ordering genes were identified by the "differentialGeneTest" function, and adjusted P value less than 0.001 were regarded as significant and used to order cells. The discriminative dimensionality reduction with trees (DDRTree) method was used to reduce data to two dimensions and visualized through the "plot_cell_trajectory" function. To detect genes that play essential roles in cell fate decisions, branched expression analysis modeling (BEAM) from Monocle was implemented to identify genes with branch-dependent expression and visualized with the "plot_genes_branched_heatmap" function.
Preprocessing and analysis of scTCR-seq data. TCR sequence data from Chromium single cell 5′ RNA-seq libraries were processed by CellRanger (version 3.0.2) with default parameters. To compare the TCR data among different samples, multiple libraries were analyzed together according to the documentation provided by 10× Genomics (https://support.10xgenomics.com/single-cell-vdj/software/ pipelines/latest/advanced/multi-library-samples). First, we constructed a mro file and checked by "cellranger mrc". Then, "cellranger vdj" was used to align TCR data to human Cell Ranger V(D)J compatible reference (http://cf.10xgenomics.com/ supp/cell-vdj/refdata-cellranger-vdj-GRCh38-alts-ensembl-3.1.0). The number of distinct UMIs aligned to each TCR alpha/beta pair less than 10 were filtered out, and only the productive TCR alpha/beta pairs were kept for further analysis. Finally, we identified the TCR alpha/beta pairs for 13,349 cells.
In vitro T reg suppression assay, cell lineage maintenance and survival. Based on FACS analysis, DAPI − CD4 + CD25 − CD44 − CD62L + T cells were sorted and used as responder cells (T resp ). For in vitro suppression assay, T resp cells (4 × 10 4 ) were labeled with Tag-it Violet TM proliferation and cell tracking dye kit (Biolegend) and cultured in the presence or absence of FACS-isolated T reg cells from distinct clusters or different path cells with the ratio 1: 2 (T reg : T resp ) for 96 h, with CD3/CD28 T cell activator (3 µl/ml, STEMCELL). The cell division index of T resp cells was assessed by dilution of Tag-it Violet, using FlowJo software. The calculation formula to determine the suppression ability of T reg in vitro is: Suppression (%) = (Percentage of proliferating T resp cells alone -Percentage of proliferating T resp cells treated with T reg )/ Percentage of proliferating T resp cells alone × 100 50,51 . The in vitro FOXP3 stability, proliferation capacity and apoptosis of T reg cells were assessed after 96 h in vitro culture (3 µl/ml CD3/CD28 T cell activator, 20 ng/ml IL-2) by the expression of FOXP3, the dilution of Tag-it Violet and the expression of Annexin V, respectively.