The development of the digestive tract is critical for proper food digestion and nutrient absorption. Here, we analyse the main organs of the digestive tract, including the oesophagus, stomach, small intestine and large intestine, from human embryos between 6 and 25 weeks of gestation as well as the large intestine from adults using single-cell RNA-seq analyses. In total, 5,227 individual cells are analysed and 40 cell types clearly identified. Their crucial biological features, including developmental processes, signalling pathways, cell cycle, nutrient digestion and absorption metabolism, and transcription factor networks, are systematically revealed. Moreover, the differentiation and maturation processes of the large intestine are thoroughly investigated by comparing the corresponding transcriptome profiles between embryonic and adult stages. Our work offers a rich resource for investigating the gene regulation networks of the human fetal digestive tract and adult large intestine at single-cell resolution.
Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
This work was supported by grants from the National Natural Science Foundation of China (31625018, 31230047, 31601177, 81521002), the Ministry of Science and Technology of China (2017YFA0102702), a General Financial Grant from the China Postdoctoral Science Foundation (2017M610703) and Shanghai Science and Technology Development Funds (16YF140940).
Integrated supplementary information
(a) Dot plot showing the expression levels of well-defined cell-type marker genes in each group. (b) t-SNE showing the organ source of each group. In a and b, n = 4,089 cells. (c) Boxplot showing the expression levels of erythrocyte-specific markers in group 31 (n = 103 cells) and the control group (n = 54 cells). The black central line is the median, the boxes indicate the upper and lower quartiles, and the whiskers indicate 1.5× the interquartile range. (d) t-SNE showing the expression pattern of the muscle-specific marker gene MYLPF. n = 4,089 cells. (e) The correlation coefficients of the four tissues at each stage. The early-stage organs were more similar to each other than the late-stage organs. The correlation values are derived from Pearson pairwise correlation. 6W: n = 607 cells, 7W: n = 774 cells, 8W: n = 319 cells, 9W: n = 328 cells, 11W: n = 232 cells, 14W: n = 224 cells, 16W: n = 195 cells, 19W: n = 257 cells, 21W: n = 367 cells, 24W: n = 624 cells, 25W: n = 162 cells. The black central line is the median, the boxes indicate the upper and lower quartiles, and the whiskers indicate 1.5× the interquartile range. (f) Cell cycle analysis of the cells inside the circle (Groups 33-44, 48-51) and the cells outside the circle (Groups 1-33 and 48), based on the expression of sets of G1/S and G2/M genes derived from the scRNA-seq data. The proliferation index, measured as the percentage of cells in S + G2/M, was higher for the cells inside the circle than for the cells outside it.
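The proliferation index in panel f is described only as the percentage of cells in S + G2/M. The following is a minimal sketch of one way such an index could be computed, not the authors' pipeline. The gene lists, the expression threshold and the function names are assumptions chosen for illustration; the legend does not specify them.

```python
from statistics import mean

def phase_score(cell, genes):
    """Mean expression of a gene set in one cell (missing genes count as 0)."""
    return mean(cell.get(g, 0.0) for g in genes)

def proliferation_index(cells, g1s_genes, g2m_genes, threshold=1.0):
    """Percentage of cells whose G1/S or G2/M score exceeds the threshold.

    The threshold is a hypothetical cut-off, not a value from the paper.
    """
    if not cells:
        return 0.0
    cycling = sum(
        1
        for cell in cells
        if phase_score(cell, g1s_genes) > threshold
        or phase_score(cell, g2m_genes) > threshold
    )
    return 100.0 * cycling / len(cells)

# Illustrative gene sets only; the paper's G1/S and G2/M lists are not given here.
G1S = ["MCM2", "PCNA"]
G2M = ["TOP2A", "MKI67"]

# Toy expression values for four cells.
cells = [
    {"MCM2": 3.0, "PCNA": 2.0, "TOP2A": 0.0, "MKI67": 0.0},  # high G1/S score
    {"MCM2": 0.0, "PCNA": 0.0, "TOP2A": 4.0, "MKI67": 2.0},  # high G2/M score
    {"MCM2": 0.5, "PCNA": 0.0, "TOP2A": 0.0, "MKI67": 0.5},  # below threshold
    {"MCM2": 0.0, "PCNA": 0.0, "TOP2A": 0.0, "MKI67": 0.0},  # non-cycling
]

index = proliferation_index(cells, G1S, G2M)
print(f"Proliferation index: {index:.1f}%")
```

Comparing this value between two sets of groups, such as the cells inside and outside the circle, mirrors the comparison made in the panel.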
Supplementary Figure 2 Pseudotime construction in each organ and global expression patterns of selected cell-type-specific markers.
(a) Pseudotime construction of the four gastrointestinal tract organs. The cells are colored according to their actual developmental stages. (b-e) Smoothed splines showing the expression patterns of stem cell-, cell cycle- and organ-specific genes along the pseudotime for each organ. Colored lines and shaded regions represent the expression tendency of each gene and the s.e.m., respectively. For example, the epithelial markers EPCAM and CDH1 showed relatively stable expression throughout the development of the human fetal esophagus (b). In the stomach from 6W to 25W, the expression levels of both the mucous neck cell marker TFF2 and the recently identified multipotent mammary stem cell marker PROCR gradually increased. (f) Western blotting shows that the protein level of MUC2 gradually increased during the development of the S-Intes and L-Intes, whereas the protein level of OLFM4 initially increased and subsequently decreased during the development of the S-Intes. The S-Intes and L-Intes of 7W, 14W and 24W embryos were used for the Western blotting experiments, which were independently repeated four times with similar results. The unprocessed gel data are provided in Supplementary Fig. 9. (g) Pseudotimes for the four organs after normalization to the same time scale; the smoothed line shows the cell number density along the normalized pseudotime in each organ. A majority of the cells in the esophagus and stomach reached a relatively mature state at the mid-developmental stage, whereas the small and large intestines matured gradually throughout the entire developmental period from 6W to 25W. The color represents the organ source. (h) Line plot showing the actual stage information of cells along the normalized pseudotimes. (i) Analysis pipeline of the cell-type definition for each organ.
To further resolve the cellular heterogeneity among these four organs and better identify the cell types, we combined StemID and K-nearest neighbor (KNN) in the subsequent analysis in each organ. For a-e and g-h, Esophagus: n = 773 cells, Stomach: n = 854 cells, S-Intes: n = 868 cells, L-Intes: n = 849 cells.
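Panels g and h place the pseudotimes of the four organs on a common scale, but the legend does not say how the normalization was done. As a hedged illustration, the sketch below assumes simple min-max rescaling of each organ's pseudotime to the interval [0, 1]; the function name and the toy values are hypothetical.

```python
def normalize_pseudotime(values):
    """Rescale one organ's pseudotime values to [0, 1] (min-max; an assumption)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All cells share one pseudotime value; map them to the start of the scale.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

# Toy pseudotime values per organ (not data from the study).
pseudotime = {
    "Esophagus": [0.0, 5.0, 10.0],
    "Stomach": [2.0, 4.0, 6.0, 8.0],
}

normalized = {organ: normalize_pseudotime(v) for organ, v in pseudotime.items()}
for organ, values in normalized.items():
    print(organ, [round(v, 2) for v in values])
```

After rescaling, cell densities along the pseudotime axis can be compared between organs even though the raw pseudotime ranges differ.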
(a) Boxplot showing the expression levels of cluster-specific genes and known esophageal cell-type markers. Different colors represent different cell types. Ciliated epithelial markers: NME5 and DNAI1. Basal cell markers: KRT6A and KRT6B. Secretory cell markers: MUC1 and MUC16. Cluster 1: n = 26 cells, Cluster 2: n = 18 cells, Cluster 3: n = 415 cells, Cluster 4: n = 314 cells. (b) Averaged expression levels of ANXA1 in the esophagus (n = 773 cells), stomach (n = 854 cells), S-Intes (n = 868 cells) and L-Intes (n = 849 cells). Error bars, s.e.m.; *P < 2.2×10^-16. (c) Immunostaining of ANXA1 in the esophagus at three developmental stages (7W, 14W and 23W). White triangles indicate ANXA1+ cells. Dashed lines indicate the structure of the squamous epithelial layer. (d) Bar plot showing the expression levels of ANXA1 and EPCAM in all esophagus cells at single-cell resolution. The cells are ordered according to their ANXA1 and SOX2 expression levels. Early: n = 500 cells, Mid-: n = 138 cells, Late: n = 135 cells. (e) Immunostaining of ANXA1 in the 23W S-Intes. White triangles indicate ANXA1+ cells. Dashed lines indicate the villus. (f) The epithelial cells in the esophagus (n = 773) showed high ANXA1 expression, whereas those in the S-Intes (n = 868) showed only marginal levels of ANXA1 expression (NS, not significant; *P<0.01). The precise P values from left to right: < 2.2×10^-16, < 2.2×10^-16, < 2.2×10^-16, 0.2107, 0.0309. (g) Comparison of ANXA1 expression between S-Intes endothelial cells (n = 18) from Groups 23-25 identified in the global t-SNE analysis and S-Intes epithelial cells (EPCAM^High, n = 18). The endothelial cells in the S-Intes showed high ANXA1 expression, whereas the epithelial cells in the S-Intes showed essentially no ANXA1 expression (***P<0.01). The precise P values from left to right: 2.611×10^-7, 0.0009123, 7.832×10^-6, 2.008×10^-5.
In a, f and g, the black central line is the median, the boxes indicate the upper and lower quartiles, and the whiskers indicate 1.5× the interquartile range. In b, f and g, two-sided Mann-Whitney U tests were used. (h) Immunostaining of PECAM1 and ANXA1 in the 24W S-Intes. In c, e and h, scale bars, 25 μm. The experiment was independently repeated twice with similar results.
Immunostaining of well-known cell-type-specific markers in the 7W and 24W S-Intes. Enterocyte markers (CDH1, VIL1), goblet cell markers (MUC2, TFF3), endocrine cell markers (CHGA, PYY), tuft marker and stem cell marker (DCLK1), Paneth cell marker (LYZ) and stem/progenitor cell markers (OLFM4, SOX9). Interestingly, VIL1 and CDH1, two well-known enterocyte markers, showed distinct expression patterns in the 7W fetal S-Intes. VIL1 was expressed in columnar epithelial cells, whereas CDH1 was expressed in a minor proportion of the cells located in the putative mesenchymal layer. Moreover, MUC2 and TFF3 were primarily expressed in the villus, whereas SOX9 and OLFM4 were specifically expressed in the crypts. Scale bars, 25 μm. All experiments were independently repeated twice with similar results.
(a) Immunostaining of well-known cell-type markers in the 7W and 24W fetal L-Intes and adult L-Intes. Enterocyte markers (CDH1 and VIL1), goblet cell markers (MUC2 and TFF3), endocrine cell markers (CHGA and PYY), stem/progenitor cell markers (OLFM4 and SOX9), tuft marker and stem cell marker (DCLK1), and Paneth cell marker (LYZ). Scale bars, 25 μm. Unexpectedly, the fetal L-Intes also expressed LYZ, which was previously reported to exist only in the Paneth cells of the S-Intes. All experiments were independently repeated at least twice with similar results. (b) Immunostaining of TFF3 and CHGA in the 24W L-Intes, 24W S-Intes and adult L-Intes. White triangles indicate TFF3+ CHGA+ cells. Scale bars, 25 μm. All experiments were independently repeated twice with similar results. (c) Heatmap showing the expression levels of cell cycle-related genes in each L-Intes cluster. The cell cycle index is shown at the bottom of the heatmap. n = 849 cells.
Supplementary Figure 6 Cell-Type identification of adult large intestine and comparison between fetal and adult stages.
(a) Heatmap showing the averaged expression level of well-known cell-type markers and selected signaling pathway-related genes in the twelve clusters of adult L-Intes. The colors from blue to red represent the expression level from low to high. n = 1,303 cells. (b) A summary of the features of all adult large intestine cell types, including cell number, cell cycle index and cell type markers. Newly identified markers are shown in red, and the well-known markers are shown in black. (c) PCA of all stages of fetal L-Intes cells and all adult L-Intes cells (n = 2,218 cells).
(a-c) Clustering of gene expression tendencies along the developmental stages of the esophagus, stomach and large intestine. Solid black and colored lines represent the expression tendencies of all genes and each gene, respectively. Esophagus: 7W (n = 168 cells), 14W (n = 65 cells), 19W (n = 57 cells); Stomach: 7W (n = 233 cells), 14W (n = 37 cells), 24W (n = 98 cells); L-Intes: 7W (n = 118 cells), 14W (n = 57 cells), 19W (n = 53 cells), 24W (n = 188 cells). (d) Schematic diagram of the interactions of epithelial and mesenchymal cells. (e) Bar plot showing S-Intes cell expression profiles of Hedgehog and other signaling pathway-related genes at the single-cell level. Cells are ordered by their VIM expression level. The Hedgehog signaling-related genes (PTCH1, SMO and GLI2/3) tend to be expressed by the mesenchymal cells (VIM^High cells) but not the epithelial cells, and SFRP1 is indeed expressed by the mesenchymal cells that highly express Hedgehog signaling-related genes. Moreover, the TGF-β signaling pathway-related genes ZEB1 and ZEB2 are highly expressed by the mesenchymal cells, which barely express CDH1. In addition, FGFR3 tends to be expressed by the epithelial cells, whereas FGFR1 tends to be expressed by the mesenchymal cells. n = 868 cells. (f) Boxplot showing the expression levels of selected genes in S-Intes epithelial cells (EPCAM^High cells) and S-Intes mesenchymal cells (VIM^High cells). The 50 cells with the highest EPCAM expression and the 50 cells with the highest VIM expression were used for the analysis. The black central line is the median, the boxes indicate the upper and lower quartiles, and the whiskers indicate 1.5× the interquartile range.
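Panel f compares the 50 cells with the highest EPCAM expression against the 50 cells with the highest VIM expression. A minimal sketch of that selection step is given below, assuming each cell is stored as a dictionary with an identifier and per-gene expression values; the data layout, function name and toy values are hypothetical.

```python
def top_cells_by_marker(cells, marker, n=50):
    """Return the ids of the n cells with the highest expression of `marker`."""
    ranked = sorted(
        cells,
        key=lambda cell: cell["expr"].get(marker, 0.0),
        reverse=True,
    )
    return [cell["id"] for cell in ranked[:n]]

# Toy cells (not data from the study); n=2 is used here instead of 50.
cells = [
    {"id": "c1", "expr": {"EPCAM": 9.0, "VIM": 0.1}},
    {"id": "c2", "expr": {"EPCAM": 0.2, "VIM": 8.0}},
    {"id": "c3", "expr": {"EPCAM": 7.5, "VIM": 0.3}},
    {"id": "c4", "expr": {"EPCAM": 0.0, "VIM": 6.0}},
]

epithelial = top_cells_by_marker(cells, "EPCAM", n=2)
mesenchymal = top_cells_by_marker(cells, "VIM", n=2)
print("EPCAM-high:", epithelial)
print("VIM-high:", mesenchymal)
```

Ranking by a single marker is a simple way to obtain two clearly separated populations, at the cost of discarding cells with intermediate expression.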
Supplementary Figure 8 Transcription factor network analysis during fetal esophagus, stomach and L-Intes development.
(a, d and g) Top candidate master transcription regulators during esophagus, stomach and L-Intes development from early stage to mid-stage. Esophagus (early stage: n = 500 cells, mid-stage: n = 138 cells, late stage: n = 135 cells). Stomach (early stage: n = 586 cells, mid-stage: n = 75 cells, late stage: n = 193 cells). L-Intes (early stage: n = 481 cells, mid-stage: n = 95 cells, late stage: n = 273 cells). The violin plots show the expression levels of each master transcription regulator. The density of the violin plots was scaled to a maximum of 1 by setting ‘scale = area’, and all violins have the same maximum width. (b, e and h) Top candidate master transcription regulators during esophagus, stomach and L-Intes development from mid-stage to late stage. The violin plots show the expression levels of each master transcription regulator. The density of the violin plots was scaled to a maximum of 1 by setting ‘scale = area’, and all violins have the same maximum width. (c, f and i) Transcription factor correlation network during fetal esophagus, stomach and L-Intes development. Nodes (TFs) with more than three edges are shown, with each edge representing a high correlation (>0.3) between the related TFs. The correlation values are derived from Pearson pairwise correlation. Yellow, orange and red circles represent the highest average expression levels of genes in early, mid- and late stages, respectively. Esophagus: n = 773 cells, Stomach: n = 854 cells, L-Intes: n = 849 cells.
Unprocessed blots for related figures in Supplementary Fig. 2f
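The network in panels c, f and i is defined by two rules stated in the legend: an edge joins two TFs whose pairwise Pearson correlation exceeds 0.3, and only nodes with more than three edges are shown. The sketch below applies those two rules to toy data. The TF names, expression values and function names are hypothetical, and this is an illustration of the rules rather than the authors' code.

```python
from itertools import combinations
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0
    return cov / (sx * sy)

def tf_network(expr, r_min=0.3, min_edges=3):
    """Return (edges, shown_nodes) using the thresholds quoted in the legend."""
    edges = [
        (a, b)
        for a, b in combinations(sorted(expr), 2)
        if pearson(expr[a], expr[b]) > r_min
    ]
    degree = {}
    for a, b in edges:
        degree[a] = degree.get(a, 0) + 1
        degree[b] = degree.get(b, 0) + 1
    # Keep only nodes with MORE than min_edges edges.
    shown = {tf for tf, d in degree.items() if d > min_edges}
    return edges, shown

# Toy expression vectors across five cells (placeholder TF names).
expr = {
    "TF_A": [1, 2, 3, 4, 5],
    "TF_B": [2, 4, 6, 8, 10],
    "TF_C": [1, 2, 3, 4, 6],
    "TF_D": [0, 1, 1, 2, 3],
    "TF_E": [1, 3, 2, 5, 4],
    "TF_F": [5, 4, 3, 2, 1],  # anticorrelated with the others, so it gets no edges
}

edges, shown = tf_network(expr)
print(len(edges), "edges; nodes shown:", sorted(shown))
```

In this toy example the five mutually correlated TFs each have four edges and are kept, while the anticorrelated one is dropped.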