The mammalian kidney has evolved to provide critical adaptive regulatory mechanisms, such as the excretion of waste, and the maintenance of water, electrolyte and acid–base homeostasis to the body. These functions require the coordinated development of specific cell types within a precise three-dimensional pattern. Defects in kidney development are amongst the most common malformations at birth. Congenital anomalies of the kidney and urinary tract (CAKUTs) represent more than 20 percent of birth defects overall1, and they account for a large fraction of chronic kidney disease and renal failure in children2. For example, the number of nephrons formed at birth is thought to be an important determinant of renal function, because reduced nephron numbers are often observed in humans with primary hypertension and chronic kidney disease3,4. An estimated 37 million people in the United States (~ 15% of the population) have chronic kidney disease (CKD)5,6 that can lead to kidney failure requiring transplant or dialysis. Development of strategies to enhance renal repair or regeneration are needed to reduce the morbidity and mortality associated with kidney disease, and they are dependent on a better understanding of the molecular genetic processes that govern kidney development.

Nephrons form the functional units of the kidney and are derived from a nephron progenitor (NP) cell population, also known as cap mesenchyme. These cells are capable of self-renewal, which is necessary to generate an appropriate number of nephrons during the course of embryogenesis and development. They are also multipotent, that is they have the ability to differentiate into the multiple cell types of the mature nephron7,8. More specifically, multipotent Cbp/P300-Interacting Transactivator 1 (CITED1)-positive/Sine Oculis Homeobox Homolog 2 (SIX2)-positive nephron progenitors give rise to multiple nephron segments, and are termed “self-renewing” nephron progenitors9. The transition of nephron progenitors into epithelialized structures is dictated by a series of tightly orchestrated signaling events. Of this, Bone Morphogenetic Protein 7 (BMP7) induces the initial exit of CITED1+/SIX2+ cells into a CITED1-/SIX2+ state, which marks nephron progenitors “primed” for differentiation by ureteric bud-derived Wnt family member 9b (WNT9B)/β-catenin signaling. Conversely, remaining CITED1+/SIX2+ nephron progenitors are kept in an undifferentiated and self-renewing state in response to Fibroblast growth factor 9 (FGF9), WNT and BMP7 signals10,11,12,13,14,15,16,17,18.

Upon WNT9B/β-catenin stimulation, nephron progenitors undergo a mesenchymal to epithelial transition to form pre-tubular aggregates, which then proceed to develop sequentially into polarized epithelial renal vesicles, comma- and then S-shaped body structures. Cells in the proximal portion of the S-shaped body differentiate into podocytes (glomerular development), while its mid- and distal portions give rise to tubular segments of the nephron, which are subdivided into proximal tubules, loops of Henle and distal tubules19, Fig. 1a. During the S-shaped stage of glomerular development, developing podocytes secrete vascular endothelial growth factor A (VEGFA), which attracts invading endothelial cells into the cleft of the S-shaped body. Platelet-derived growth factor-β (PDGFβ) signal produced by endothelial cells mediates the recruitment of mesangial cells, which invade the developing glomerulus and attach to the forming blood vessels. By the end of maturation, the glomerulus consists of four specified cell types: the fenestrated endothelium, mesangial cells, podocytes and parietal epithelial cells of the Bowman’s capsule20,21,22,23,24.

Figure 1
figure 1

Developing embryonic day 14.5 mouse kidney cell types. (a) Schematic illustration of nephron induction and patterning. In response to signals from the ureteric bud, the metanephric mesenchyme condenses and forms a cap of nephron progenitors (= cap mesenchyme) around the ureteric bud tips. Next, a sub-population of nephron progenitors undergoes a mesenchymal to epithelial transition to form pre-tubular aggregates (PTA), which develop sequentially into renal vesicles (RV), comma-shaped body (CSB) and S-shaped body (SSB). Endothelial cells are attracted into the cleft of the SSB. Color-coded map indicates the cell fate relationship of progenitor regions in SSB structure (upper right) and adult nephron structure (lower left). Schematic of a lateral view of the metanephric kidney depicting the cortical and medullary stroma (lower right). (b) tSNE plot showing the eleven cell clusters in the embryonic mouse kidney, with cell clusters corresponding to major components indicated by color. (c) Violin plots of gene expression for known lineage-associated genes (columns), stratified by cluster (rows). Our data clearly identifies cells from the major structural components of the developing kidney.

Single-cell RNA sequencing (scRNA-seq) technology offers the ability to comprehensively identify the transcriptional and (inferred) cellular composition of the developing kidney. Recent studies in developing mouse25,26,27,28,29,30 and human kidneys31,32,33 have contributed to our understanding of subpopulations of nephron progenitors and stromal cells, lineage fidelity, novel receptor-ligand pathways, and differences between mouse and human kidney development. However, there remains limited information regarding the developing kidney at mid-gestation, when the biological processes of nephron progenitor self-renewal, commitment to differentiation, and maturation of multiple cell types occur simultaneously. scRNA-seq also has the potential to inform improvements in our ability to culture nephron progenitor cells34,35,36 and produce higher-fidelity human kidney organoids37,38, and to develop novel strategies for enhancing renal repair and regeneration.

In this study, we used scRNA-seq to interrogate cell types and transcriptomes within 4,183 cells from one kidney pair of an E14.5 female mouse embryo at mid-gestation. Clustering identified eleven clusters corresponding to the major components/cell types of the developing kidney and revealed expression of known lineage markers in unexpected cell types (e.g., renal stromal markers in nephron progenitors). Pseudotime analysis was utilized to describe transcriptional dynamics as nephron progenitors differentiate. Notably, we find that cell cycle activity distinguishes between the “primed” and “self-renewing” sub-populations of nephron progenitors, with increased levels of the cell cycle related genes Birc5, Cdca3, Smc2 and Smc4 in the “primed” sub-population. Moreover, increased Birc5, Cks2, Ccnb1, Ccnd1 and Tuba1a/b expression was also observed in immature distal tubules, suggesting their involvement in the process of fusion between the distal nephron and the collecting duct epithelia.


Single-cell gene expression identifies anatomical structures and cell lineages in the developing kidney

New nephrons are induced in response to signals from the ureteric bud throughout nephrogenesis until approximately postnatal day 3 in mice39. Thus, we chose to perform scRNA-seq at embryonic day 14.5 (E14.5), a time point at which there is simultaneous nephron progenitor self-renewal, active nephron induction and maturation of multiple cell types, to comprehensively interrogate single-cell transcriptomes spanning different stages of differentiation during kidney development at mid-gestation. Using one kidney pair from an E14.5 female mouse embryo processed using the 10X Chromium platform and Illumina sequencing, our dataset consists of 4,183 high-quality kidney cells, with a median number of 2,789 genes detected per cell. Grouping cells into eleven clusters (see “Methods”) reveals major components/cell types of the developing kidney (Fig. 1a–c, Fig. S1 and Table S1). Clusters and key markers are consistent with prior single-cell analyses of the developing mouse kidney25,26,27,28,29. We observe clear separation of cells of the hematopoietic (Cd52, Fcer1g), ureteric bud/collecting duct (Calb1, Gata3), and endothelial (Emcn, Kdr) lineages from other cells of the developing kidney (nephron progenitors, mixed/differentiating cells, podocytes, tubular cells and stromal cells). Stromal lineages are marked by expression of Col1a1 and Meis1, while cells derived from the nephron progenitor lineage express established marker genes associated with progressive stages of nephron differentiation. Thus, Cited1 and Six2 identify nephron progenitors, Lhx1 and Pax8 mark mixed/differentiating cells, Fxyd2 and Hnf4a mark tubular cells, and podocytes are marked by Podxl and Nphs1.

Consistent with other reports, we identify three different stromal clusters: medullary stroma (Col1a1, Meis1 and Cldn11), cortical stroma (Col1a1, Meis1, Aldh1a2 and Dlk1) and mesangial stroma (Col1a1, Meis1, Dlk1, Postn)25,28,30. Analyses of in situ hybridization data at E14.5 from other reports, as well as GUDMAP (The GenitoUrinary Development Molecular Anatomy Project) and Eurexpress public resources facilitated identification and assignment of these clusters27,40,41,42.

Taken together, these results show that our scRNA-seq data successfully captured major cell types that are expected to be present in the developing kidney at E14.5, including progenitor cells and their derivatives as well as mature cell populations.

Stratification of cell types in the nephron progenitor lineage

Next, we focused on nephron progenitor cells and their descendant/derived cell types (mixed/differentiating, podocytes and tubular cells). Selecting those cell types yielded 1,727 cells for further analysis. We are able to clearly distinguish between proximal and distal tubular cells and podocytes, and pseudotime analysis allows us to assess the level of lineage commitment (Fig. 2a). Nephron progenitor cells (marked by Six2 and Cited1 expression) clearly separate into two sub-groups, “self-renewing” and “primed” (Fig. 2). Mixed/differentiating cells express transcription factors like Pax8 and Lhx1, which are associated with nephron development and encompass nephron progenitor cells differentiating into tubular cells and podocytes. We note heterogeneity in the mixed/differentiating cell cluster, which likely contains cells with different degrees of differentiation, like pre-tubular aggregate, renal vesicle, comma-, and S-shaped bodies. Pseudotime analysis on this subset of cells reconstructs three lineages: differentiation into podocytes, and into proximal and distal tubular cells (Fig. S2). This enabled us to distinguish between mature and immature podocytes and proximal/distal tubular cells (Fig. 2, see “Methods”). Overall, our data clearly shows the major differentiation trajectories of nephron progenitor cells. For these three lineages, podocytes are marked by increasing expression of Podxl and Nphs143, proximal tubular cells by Pdzk1 and Slc34a144, and distal tubular cells by Tmem52b29 and Shd45 (Fig. S3 shows a heatmap of genes with pronounced expression differences during nephron progenitor cell differentiation and Table S2 summarizes differentially expressed genes).

Figure 2
figure 2

Cell types of the nephron progenitor lineage. (a) shows a tSNE plot of NP-derived cells, with clusters corresponding to cell types indicated by colors. The prefix “i_” indicates immature cells, while “m_” indicates mature cells. (b) shows violin plots of gene expression for known lineage-associated genes (columns), stratified by cluster (rows). We observe two types of NP cells (“self-renewing” and “primed”) and clear separation of distal and proximal tubular cells and podocytes in our data.

Transcriptional dynamics across nephron progenitor cell differentiation

Cell cycle activity distinguishes between two different types of nephron progenitor cells

Comparing gene expression between “self-renewing” and “primed” NP cells yielded cell cycle as a main difference between the two types of nephron progenitor cells (Fig. 3, Table S3). We find that cell cycle-related genes like Birc5, Cdca3, Smc2 and Smc4 are up-regulated between “primed” and “self-renewing” nephron progenitor cells (FDR-adjusted p values < 1E-21 for each, Supplemental Table S3). Immunofluorescence analysis on kidney sections from E14.5 and P0 mice indeed corroborates these observations, showing increased expression of Birc5 in “primed” nephron progenitors, along with the pre-tubular aggregates/renal vesicles immediately derived from primed nephron progenitors, but negligible or absent expression in the “self-renewing” nephron progenitor cells (Fig. 3d sub-panels α, α′, β, β′ and panel 3e). Birc5 expression was observed highest in cells annotated to cell cycle S-phase (Fig. S4). These results corroborate previous findings that demonstrated that the committed nephron progenitor cells are more proliferative (= fast-cycling population) and more likely to differentiate than the slow-cycling, self-renewing NP population46. Next, comparing primed nephron progenitor cells with mixed/differentiating cells, we observe up-regulation of transcription factors associated with differentiation (Lhx1, Pax8, FDR-adjusted p values: 9.67E-15 and 1.59E-9, respectively), and down-regulation of nephron progenitor-associated genes like Cited1, Six2, Eya1, Crym, Meis2, Rspo1 and others (Figs. 2b, 3a, e, Table S4, FDR-adjusted p value < 1E-16 for each). In situ hybridization analysis confirms the expression of Rspo147 primarily in nephron progenitors (Fig. 3f sub-panels α, β). Gene Ontology enrichment analysis comparing self-renewing with primed nephron progenitors, and primed progenitors with differentiating cells highlights differences in cell cycle-related biological processes between the two types of nephron progenitor cells and processes associated with differentiation between primed nephron progenitors and differentiating cells (Fig. 3b, c and Table S5).

Figure 3
figure 3

Transcriptional signatures of self-renewing, primed and differentiating nephron progenitor cells. (a) shows differentially expressed genes on a heatmap of 100 random cells for each of the “self-renew”, “primed” and differentiating clusters, with key genes annotated on the right. (b) and (c) show the 20 most-enriched Gene Ontology terms for genes differentially expressed between self-renewing and primed NP cells, and between primed NP cells and differentiating cells, respectively. (d) Immunofluorescence on kidney sections from embryonic day 14.5 (E14.5) and postnatal day 0 (P0) mice using anti-BIRC5 (α - β′) and anti-Cyclin D1 (γ - δ’) antibodies (red). Nephron progenitors and their early epithelial derivatives were detected using an antibody against anti-Neural cell adhesion molecule (NCAM; green). Nuclei were counterstained with DAPI (blue). Scale bar, 25 μm. The sub-panels α′, β′, γ′ and δ′ are close-ups of the areas indicated by the white boxes. (e) Expression of Birc5, Ccnd1 and Rspo1 across pseudotime; colors indicate cell clusters. (f) In situ hybridization on cryosections of P0 kidneys confirms the expression of Rspo1 in nephron progenitors and their early epithelial derivatives (α). No signal was detected with sense probe hybridization (β). Images are representative of three independent experiments. Scale bar 25 μm. (g) Inferred regulatory module activity based on SCENIC50 across pseudotime for predominantly self-renewing and primed nephron progenitor cells.

To better understand these transcriptional changes occurring between self-renewing and primed nephron progenitor cells, we performed two additional types of gene set enrichment analyses utilizing lineage pseudotime annotation. Focusing on genes with changes across pseudotime (FWER < 0.01) we analyzed up-regulated and down-regulated genes separately. Performing enrichment analysis across Gene Ontology and Hallmark gene sets from MSigDB48,49 yielded “EPITHELIAL_MESENCHYMAL_TRANSITION” (FDR-adjusted p value: 7.1E-6) as the most enriched Hallmark gene set for the down-regulated genes, while gene sets enriched for up-regulated genes included “E2F_TARGETS” as the most enriched term as well as a multitude of gene sets associated with cell cycle/replication (see Table S6). We then used SCENIC50 to gain some insight into gene regulation driving the transcriptional changes we observe across pseudotime between the two nephron progenitor cell types. Figure 3g depicts the activity of inferred regulatory modules across pseudotime for 37 recovered transcription factors. We observe three regulatory modules of down-regulated genes, attributed to the transcription factors EGR1, MAF and FOS51,52,53. These findings corroborate previous studies showing that these transcription factors are critical regulators of gene expression, controlling transition from a pluripotent to differentiated state in nephron progenitor and human embryonic stem cells53. For up-regulated genes, we observe modules associated with cell cycle-related transcription factors like E2f8, Hcf1, Ezh2 and Mybl2, which have previously been implicated in specific aspects of cell cycle progression and cell fate decision in stem and progenitor cells54,55,56,57,58. Genes making up each module are provided in Table S8.

Since we identify cell cycle as a major difference between ‘self-renew’ and ‘primed’ NP cells, a pertinent question is whether cell cycle might perhaps explain differences between the two types of NP cells entirely. Therefore, we regressed out bioinformatically annotated cell cycle phase per cell prior to dimension reduction; however, we continue to observe separation between ‘self-renew’ and primed NP cells (Fig. S5, Tables S11 and S12). This finding indicates differences other than cell cycle contribute to differences between these two types of NP cells, and it is in line with our observation of up-regulation of transcriptions factors Pax8 and Lhx1 and down regulation of NP genes like Cited1, Crym, Gas1, Meis2, and Rspo1 in ‘primed’ NP cells (Fig. S5), all genes without direct connection to the cell cycle.

Birc5 expression in the tubular interconnection zone

The cell cycle-related genes Birc5, Cks2, Ccnb1, Ccnd1 and Tuba1a/b were up-regulated in immature distal tubules (Figs. 2b, 4a). Immunostaining analysis confirmed augmented expression of BIRC5 and cell cycle- related gene Cyclin D1 in the distal renal vesicle domain (Fig. 3d). Interestingly, increased Birc5 was also observed in a sub-set of ureteric bud cells (Fig. S6 and Fig. 3d) located in the region of interconnection between the late renal vesicle and the adjacent ureteric bud tips. The fusion between the nephron and the collecting system is required for the formation of a functional renal network. Studies in mouse models have demonstrated that this process is driven by preferential cell division within the distal renal vesicle domain59. Therefore, the expression of Birc5 marks the fusion between the nephron and collecting system, and might contribute to tubular interconnection by regulating proliferation in the late renal vesicle and cell survival in the adjacent ureteric tip cells.

Figure 4
figure 4

Transcriptional signatures of podocytes and tubular cells. (a) Violin plot of genes expressed in proximal and distal tubular cells (rows are clusters and columns denote genes). (b) In situ hybridization of P0 kidneys confirms expression of Neat1 in distal tubules. No signal was detected with sense probe hybridization. (c) Same as (a), but for podocytes.

Conserved features in mouse and human podocyte development

In the podocyte lineage, the genes most significantly defining the cluster are Pax8, Podxl and Nphs1. Podxl and Nphs1 (in combination with Synpo, Nphs2 and Vegf-a) are restricted to a sub-population of mature podocytes28,31,60, which is consistent with our observations (Figs. 2b and 4c). In a sub-population of early podocytes, Wt1 and Mafb expression has been reported to overlap with the immature marker Pax827,31,61, and is expressed in parietal epithelial cells62, also consistent with our findings (Figs. 2b and 4c). For a summary of gene expression changes during podocyte development see Fig. S7a and Table S9. Similar to previous scRNA-seq analysis in human fetal kidney31, we observe enrichment in the PDZ domain proteins Magi2, Slc9a3r2 and Pard3b in mature podocytes (Fig. 4c). We also observe podocyte-specific activity of Cldn5 (while the claudins Cldn6 and Cldn7 are expressed in tubular lineages, Fig. 4a). Further on, the gene Sparc (a cystine-rich matrix-associated protein) and the Tissue-Type Plasminogen Activator Plat are expressed specifically in the podocyte lineage (as is Robo2, a gene known to be expressed and colocalized with nephrin on the basal surface of mouse podocytes63), while the cell cycle regulator Gas1 (Growth Arrest Specific 1) is expressed in undifferentiated cells and mature podocytes, but less so in mature tubular cells (Fig. 4c). Together, these findings further define the gene expression profile of the podocyte lineage, and they suggest substantial conservation between mouse and human developing podocytes.

Gene expression differences between proximal and distal tubular cells

In addition to their respective marker genes Pdzk1, Slc34a1, Tmem52b and Shd (see above), we observe that tubular lineages express the claudins Cldn6 and Cldn7, as well as the Lymphocyte Antigen 6 Complex Locus A (Ly6a, aka Sca-1), see Fig. 4a (Fig. S5b and Table S10 show additional genes with differential expression between proximal and distal tubular cells). Ly6a is a member of the murine L6 family and has been reported to mark cancer and tissue-resident stem cells in mice64; however there is no known direct human ortholog for Ly6a, and also the function of the LU domain, which characterizes LY6A’s superfamily of proteins, is currently unknown in humans64.

We note that Mep1a, Aldob and Tmem174 mark proximal tubular cells in our data (Fig. 4a) and have been reported amongst the top-most down-regulated genes after p53 conditional deletion in nephron progenitor cells65. Of the other top three reported down-regulated genes two (Pck1 and Cyp2d26) also show proximal tubular cell specific expression (data not shown), while Reg8 expression was not detected in our data. This is in line with the observation of fewer proximal tubular cells in P0 mutant kidneys reported in65. With respect to the cell cycle, we find that Cyclin-Dependent Kinase Inhibitor 1 A (Cdkn1a, aka P21) is active specifically in proximal tubular cells, while Cyclin-Dependent Kinase Inhibitor 1 C (Cdnk1c, aka P57) is primarily expressed in podocytes (Fig. 4a and c). For distal tubular cells, we don’t observe a selectively active kinase inhibitor but note that Cyclin-Dependent Kinase-like 1 (Cdkl1) is specifically expressed in this cell type. However, we find the long non-coding RNA Neat1 specifically expressed in mature distal tubular cells (Fig. 4a, b and Fig. S8). These findings pinpoint lineage-specific gene expression differences between the proximal vs. distal tubular lineages, and they point towards lineage-specific control of the cell cycle across nephron progenitor differentiation.

Recently published scRNA-seq papers have described differences in gene expression across a variety of proximal tubule transcripts and lncRNAs in different sexes in the adult mouse kidney66,67. We observe that female-enriched markers, including Gm4450, Lrp2, Sultd1, Aadat, Hao2 were highly expressed in our proximal tubular cluster, while most of the male-enriched markers (Slc22a12, Cndp2, Cesf1, etc.) were absent or expressed at low levels. This data suggests that sexually dimorphic gene expression in proximal tubule may occur at or before E14.5.

Independent data validate key findings

We note that single-cell RNA sequencing inevitably introduces a certain level of noise, and therefore we explored our main findings using mid-gestation mouse kidney scRNA-seq data generated by Magella et al.27 (Fig. S9). We find that cell type labels for kidney cell types agree well between the two data sets in general, and specifically for NP and NP-derived cell types. We also find clear evidence in this complementary dataset for distinct types of NP cells, with Birc5 expression clearly distinguishing between different cell types. Therefore, we are able to validate our main observation regarding different types of NP cells on independent data.

Expression of known lineage marker genes in unexpected cell types

Expression of known lineage marker genes in unexpected cell types has been reported based on the analysis of scRNA-seq data, for example stromal cells express Gdnf27. Consistent with this report, we found that nephron progenitor markers (Six2, Cited1, Crym) are expressed in cells in the stromal cluster, and that stromal markers are present in the nephron progenitor cluster (Meis1, Foxd1, Crabp1). We also confirm that Gdnf is expressed in the stromal cluster (in addition to nephron progenitor cells), and that Aldh1a2 RNA is present in stromal and nephron progenitor clusters (Fig. 5).

Figure 5
figure 5

Expression of lineage-marker genes in unexpected cell types. Heatmap of gene expression (gray scale) of known lineage marker genes (rows) across cells (columns), ordered by cell clusters (color index). We observe the expression of cap mesenchyme markers (Cited1, Six2, Crym, Gdnf) in stromal cells and vice versa, consistent with previous reports27.

We note that nephron progenitor marker genes are not homogenously expressed across different stromal cell types. For instance, Cited1 is detected (five or more reads) in about 7% of cortical stromal cells, but in less than 1% of other stromal cell types. We find similar enrichment (expression in ~ 7% vs. less than 1% of cells) for Six2 and Crym in cortical stroma, whereas Gdnf is more modestly enriched in cortical stroma (expressed in ~ 3% vs. less than 1% of cells, respectively). We next focused on cortical stroma and looked at co-expression of nephron progenitor and stromal marker genes in the same cells (binary expression “on” vs. “off”) and find significant positive association between the expression of stromal- and nephron-progenitor genes (Fisher exact test, Table 1). This analysis demonstrates widespread co-expression of nephron-progenitor and stromal markers in the same cortical/stromal cells, and on average we observe higher odds ratios of association for Col1a1 expression with nephron progenitor lineage markers, compared with Meis1 (Table 1).

Table 1 Co-expression of stromal marker genes and nephron progenitor marker genes in cortical/stromal cells.

Further on, the cluster we identified as distal tubular cells contains cells with a distal-like expression profile, as characterized by the expression of Tmem52b and Shd29. However, despite the distinct lineage origins, cells from this cluster and from the "ureteric_bud/collecting_duct" cluster exhibit some transcriptional congruence29,68. Specifically, Calb1, Wdfc2 and Mal28, which are thought to mark the ureteric bud lineage, and Mecom67, which is thought to mark distal tubular cells, are expressed in a significant fraction of cells in both these clusters, but absent in proximal tubular cells (Table 2).

Table 2 Ureteric bud / collecting duct lineage genes are expressed in distal tubular cells. Shown is the percentage of cells expressing ureteric bud / collecting duct (ub/cd) lineage marker genes Calb1, Mal, Mecom and Wfdc2 for cells from immature and mature distal and proximal tubular clusters, and also from the ub/cd cluster.


Over 30 terminally differentiated nephron cell types are required for the function of the mammalian kidney. The advent of scRNA-seq technology has made it possible to explore the cellular heterogeneity of the kidney and precisely identify the transcriptional signatures that define each of its cell types. In this study, we have performed scRNA-seq analysis of the developing mouse kidney at E14.5, a time point in which there is active nephron induction and varying degrees of nephron maturation. Major transcriptional clusters—corresponding to nephron progenitors, mixed/differentiating cells, podocytes, differentiated tubules (proximal and distal), ureteric epithelium, stroma (medullary, mesangial and cortical), hematopoietic and endothelial lineages—are identified within the whole kidney analysis, and are consistent with prior single-cell analyses of the developing mouse kidney25,26,27,29. We find that cell cycle activity distinguishes between “primed” and “self-renewing” sub-populations of nephron progenitors. Furthermore, augmented expression of cell cycle-related genes Birc5, Cks2, Ccnb1, Ccnd1 and Tuba1a/b was also detected in immature distal tubules, suggesting cell cycle regulation may be required for early events of nephron patterning and tubular fusion between the distal nephron and the collecting duct epithelia.

All nephron segments derive from a multipotent self-renewing nephron progenitor population, which co-expresses the transcription factor Six2 and the transcriptional activator Cited1. Previous studies have identified two sub-types of nephron progenitors, with CITED1+/ SIX2+ progenitors transitioning to a CITED1-/ SIX2+ primed state as nephrogenesis proceeds15,34,69. Recent studies using time-lapse imaging and scRNA-seq analyses have indicated, however, that the nephron progenitor compartment is more heterogeneous than initially supposed26,27,29,70,71,72. Moreover, differences in cell cycle length within progenitors appear to play a role in the sub-compartmentalization of the progenitor population46. In agreement, our scRNA-seq analysis shows separation of nephron progenitor cells into a “self-renewing” and “primed” sub-population, both co-expressing Six2 and Cited1, but distinguished by higher cell cycle activity in the “primed” cells. Studies in mice have demonstrated that the committed nephron progenitors are more proliferative, exhibiting preferential exit from the cap mesenchyme compartment and differentiation into early nephrons46. Intriguingly, in the human renal cap mesenchyme, the “self-renewing” nephron progenitors exhibit a greater proliferative activity, compared to the committed progenitor population32. Although it is still unclear what drives these species-specific differences, this may be related to unique transcription factor expression in the human fetal kidney (such as continuous Six1 expression in cap mesenchyme throughout nephrogenesis)32.

We find that the transcriptional profile of “primed” nephron progenitors represents an intermediate/transitional state between self-renewing NP and mixed/differentiating cells (pre-tubular aggregates/renal vesicles), with lower levels of Cited1 and increased expression of early commitment markers like Lhx1 and markers of renal epithelia like Pax8 (see Fig. S10). These findings are consistent with previous scRNA-seq analyses of developing human and mouse kidneys28,73, but are in contrast to other studies on nephron progenitor subpopulations where Cited1 expression seems to be turned off prior to the activation of pre-tubular aggregate genes15,34,72. Such discrepancies might be due to differences in the technical sensitivity of the methods applied in each study (scRNA-seq versus immunofluorescence or in situ hybridization). They also highlight the importance of further analysis to confirm whether these nephron progenitor sub-populations coincide with distinct spatial domains within the developing kidney.

Our approach successfully identified a number of cell types in the developing kidney. Consistent with previously reported expression patterns, we observe Podxl , Synpo, Nphs1 and Nphs2 expression in mature podocytes, whereas Wt1 and Mafb are also expressed in a sub-population of early podocytes31,60. Indeed, we also observed several PDZ domain proteins (Magi2, Slc9a3r2 and Pard3b)31 expressed in human developing podocytes in our data, suggesting that podocyte identity is conserved in the mouse and human developing kidney. Cells in the proximal tubular cluster are characterized by the specific expression of known proximal tubule markers, such as Pdzk1 and Slc34a129,67. The scaffold protein PDZK1 is essential for the proper localization of interacting proteins, such as the sodium-phosphate transporter NaPi-Iia (encoded by Slc34a1), in the brush border of the proximal tubular cells74,75,76. Interestingly, mutations in Slc34a1 have been linked to nephrocalcinosis and Fanconi renotubular syndrome77,78. Further on, we observe that Cldn5 marks the podocyte lineage, while Cldn6 and Cldn7 are expressed in mixed/differentiating cells and both tubular lineages, but absent in podocytes. Genes specifically expressed in distal tubular cells in our data include Galectin 3 (Lgals3)79 and the long non-coding RNA Neat1.

The formation of a fully functional nephron entails fusion between the late renal vesicle and the adjacent ureteric tip. An elegant study using 3D modeling of nephrons and Six2-eGFPCre x R26R-lacZ mice demonstrated that this connecting segment of the nephron is derived from the cap mesenchyme (not the ureteric epithelium), and the process of fusion is likely driven by preferential cell division within the distal renal vesicle domain59. In line with this, our data identified augmented expression of cell cycle-related molecules, such as Birc5, Cks2, Ccnb1, Ccnd1 and Tuba1a/b, in immature distal tubules. Interestingly, high Birc5 expression was also detected in ureteric bud cells located in the region where the ureteric tip connects with the distal portion of the renal vesicle (see Fig. S6).

Birc5 (also known as Survivin) has been implicated in a number of kidney conditions, including autosomal-dominant polycystic kidney disease, acute kidney injury and renal cell carcinomas80,81,82,83,84, however its role in context of normal kidney development is still unknown. In normal tissues, transcription of Birc5 is tightly regulated in a cell cycle-dependent manner, reaching a peak in the G2/M phase85,86,87, followed by a rapid decline at the G1 phase88. BIRC5 targets the chromosomal passenger complex (CPC) to the centromere, ultimately enabling proper chromosome segregation and cytokinesis89,90,91,92,93,94,95. BIRC5 also plays a role as an inhibitor of programmed cell death. Although this mechanism is not completely understood, it seems to require cooperation with other molecules (such as XIAP and HBXIP) and results in inhibition of caspase-996,97,98,99. Our data suggest that BIRC5 might be a novel key molecule required for early events of nephron patterning and fusion, by regulating cell survival and/or proliferation in late renal vesicle and the adjacent ureteric tip.

In line with other scRNA-seq studies, we identify three stromal clusters in our dataset: cortical, medullary and mesangial25,27. The genes most significantly defining the mesangial stroma cluster are Dlk1 and Postn25,29,30. Cells in cortical stroma express high levels of Aldh1a2 and Dlk1, while medullary stroma cluster contains cells with increased expression of Cldn1129,30. The absence of an expression profile consistent with a loop of Henle signature in our scRNA-seq data is likely due to a low-abundance of these cell populations at E14.5100. In addition, the lack of information on the cell diversity and identity within the loops of Henle continues to hinder the annotation of this segment67.

In summary, this study provides an in-depth transcriptional profile of the developing mouse kidney at mid-gestation. Major transcriptional clusters are identified and are consistent with prior single-cell analyses of the developing mouse kidney25,26,27,29. Notably, we find that cell cycle activity distinguishes between the “primed” and “self-renewing” sub-populations of nephron progenitors, with increased levels of the cell cycle related genes Birc5, Cdca3, Smc2 and Smc4 in the “primed” sub-population. Finally, Birc5 expression in immature distal tubules and ureteric bud cells may contribute to early events of tubular fusion between the distal nephron and the collecting duct epithelia.


Embryonic kidney collection and single-cell RNA sequencing

Timed pregnant wild-type CD-1 female mice used in this study were obtained from Charles River Laboratories (Wilmington, MA, USA). The date on which the plug was observed was considered embryonic day 0.5 (E0.5). All experimental procedures were performed in accordance with the University of Pittsburgh Institutional Animal Care and Use Committee (IACUC) guidelines, which adheres to the NIH Guide for the Care and Use of Laboratory Animals. They were approved by the University of Pittsburgh IACUC, protocol #17,091,432.

We harvested two kidneys at E14.5 and generated a single cell suspension using 0.05% trypsin at 37 °C for 10 min. Kidneys were mechanically dissociated with pipetting at 5 and 10 min. 3% fetal calf serum in PBS was added to halt the trypsin. The cell suspension was filtered using a 40 μm filter and pelleted. The cells were resuspended in 90% FCS in DMSO and frozen, prior to shipment to GENEWIZ Ing. Single-cell library preparation and sequencing were performed by GENEWIZ Inc. using the 10X Genomics Inc. Chromium 3’ Single Cell v2 library preparation kit. Cells exhibited high viability after freezing and thawing (> 90%).

Data processing, quality control, and normalization

Alignment and read counting

Sequencing data was processed using the cellranger count pipeline of the Cell Ranger software (version 2.2.0) ( to perform alignments and yield barcode and UMI counts, such that the cell detection algorithms are bypassed and counts for 10,000 cells are returned (force-cells = 10,000 option). The mouse reference genome (GRCm38.p4) and transcript annotations from Ensembl (version 84) were used101.

Quality control

The Bioconductor102 R package DropletUtils103,104 was used to detect and remove empty droplets with default parameters at an FDR of 0.01, yielding a total of 5,887 non-empty droplets. Multiple quality control (QC) metrics were calculated using the R package scater105 and cells with at least 1000 detected features, and percentage of mitochondrial counts less than 3 times the median absolute deviation (MAD) from the median value were considered, resulting in a total of 4,402 cells. We excluded putative doublets as the top 5% cells ranked by the hybrid score from the R package scds106, further filtering out 220 droplets. Finally, only genes with at least three or more counts in at least three samples were considered, yielding a digital gene expression matrix comprising 11,155 genes in 4,183 cells/droplets.


We normalized the data using size factors calculated using the deconvolution method implemented in the computeSumFactors function in the R package scran107 after performing clustering using the quickcluster function on endogenous features with an average count ≥ 0.1, (min.mean = 0.1 option) yielding log-transformed normalized expression data. Feature selection and dimension reduction were performed using scran procedures. Briefly, we fit a mean–variance trend to the gene variances using the trendVar function and identified the biological component of the total variance with decomposeVar. All genes with an FDR < 0.01 and proportion of biological variance of at least 25% are considered as highly variable genes (HVG). Principal component analysis (PCA) was then performed using denoisePCA and two-dimensional representation was then derived using runTSNE.

Identification of major structural components of the kidney

Cells were grouped into clusters using the scran R package by building a shared k-nearest-neighbors graph using buildSNNGraph (with use.dimred = PCA and k = 25 options), followed by clustering with the Walktrap community finding algorithm as implemented in the igraph package (, cutting the graph at 10 clusters. We used the expression of a curated list of marker genes for major components of the developing kidney (see Fig. 1c) to assign cluster labels. Cluster-specific markers were derived using the findMarkers function (Table S1). We note that at this resolution tubular distal cells were grouped in the mixed/differentiating group; specific analysis of nephron progenitor descendant cell types then revealed distinct groups of distal vs. proximal tubular cells (see below).

Nephron progenitor and descendant cell types

Selecting and characterizing NP lineage cells

Focusing on nephron progenitor and descendant cell types (termed “nephron-progenitor”, “mixed/differentiating” (at that point containing “distal_tubular” cells as well), “podocytes and “proximal_tubular” in Fig. 1) and requiring expression of each gene with at least three counts in three cells yielded a gene expression matrix of 9,611 genes across 1,273 cells. Following the same procedure as before we derived a low-dimensional representation and identified six clusters of cells, corresponding to two types of nephron progenitor cells (“self-renew” and “primed”), “mixed/differentiating” cells as well as distal tubular cells and proximal tubular cells (Fig. 2). Cluster-specific marker genes were derived as before and are reported in Table S2. Enrichment analysis for Gene Ontology terms enriched between “self-renew” and “primed” and between “primed” and “mixed/differentiating” (Figs. 3b and c, Tables S3S5) were performed using the topGO function of the limma Bioconductor package108 with default parameters. We also used SAVER109 to impute gene expression values across this set of cells, which we then utilized in pseudotime-related analyses described below.

Pseudotime Analysis of NP cells

Pseudotime analysis was performed using slingshot110, using cluster labels and principal components derived as described above (via the clusterLabels and reducedDim options). This recovered three lineages (podocytes, distal-, and proximal tubular cells), with cells in “self-renew”, “primed” and early “differentiating/mixed” being shared (see Fig. S2).

Next, we fitted a multinomial log-linear model (using the nnet package111) relating pseudotime with the annotated clusters. For cells with more than one annotated lineage (in the “self-renew”, “primed” and early “differentiating” clusters) lineage pseudotimes from slingshot were averaged. This enabled us to define NP-cells as cells with annotated pseudotime less than the (pseudo)timepoint between “primed” and “differentiating” where the probability of the “primed” cluster has declined to 50% (i.e., 50% probability for “differentiating”). These cells contain all “self-renew” cells, 9 “differentiating” cells and all but 11 “primed” cells and were used in subsequent pseudotime analyses comparing self-renewing and differentiating cells.

We then applied generalized additive models, as implemented in the mgcv package112, to screen for pseudotime-associated genes (Fig. 3g) by modeling gene expression as a smooth function of pseudotime. Focusing on high-quality pseudotime-associated genes (FWER < 1% modeling significance, plus highly expressed and with an absolute spearman correlation of gene expression with pseudotime larger than 0.4) yielded 228 genes with overall decreasing expression across pseudotime (down-regulated), and 451 with increasing expression (up-regulated). Gene sets are reported in Table S6.

We then used MsigDB (v7.0)48 and hypergeometric tests to screen for annotated gene sets enriched for up- or down-regulated genes, focusing on Gene Ontology and Hallmark gene sets (Table S7). Finally we screened for regulatory modules in time-varying genes using SCENIC50, where we used default options including GENIE3113 for network inference (see Fig. 3, and Table S8 lists the modules we recovered).

Immunohistochemical staining

Kidneys dissected from embryonic day 14.5 (E14.5) and postnatal day 0 (P0) mice were fixed overnight in 4% paraformaldehyde, embedded in paraffin and sectioned at 4 μm. After deparaffinization, rehydration, and permeabilization in PBS-Tween (PBS-T), antigen retrieval was performed by boiling the slides in 10 mM sodium citrate pH 6.0 buffer for 30 min. Next, sections were blocked in 3% bovine serum albumin (BSA) and incubated overnight with antibodies recognizing BIRC5 (#2808, Cell Signaling Technology, Danvers, MA, USA), Cyclin D1 (#2978, Cell Signaling) and Neural cell adhesion molecule (C9672, Sigma-Aldrich, St. Louis, MO, USA) at the dilutions recommended by the manufacturers. On the next day, sections were washed with PBS-T, incubated with secondary antibodies at the dilution of 1:200, washed again with PBS-T, and mounted in Fluoro Gel with DABCO (Electron Microscopy Science, Hatfield, PA) before being visualized with a Leica DM2500 microscope and photographed with a Leica DFC 7000 T camera using LAS X software (Leica, Buffalo Grove, IL, USA). Goat anti-rabbit 594 (#111–515-047) and donkey anti-mouse 488 (#715–545-151) antibodies were purchased from Jackson InmmunoResearch Laboratories (West Grove, PA, USA).

In situ hybridization

Kidneys were harvested from P0 pups, fixed in 4% paraformaldehyde overnight, treated with 30% sucrose/PBS and embedded in Tissue-Tek Optimal Cutting Temperature Compound (OCT; Sakura, Torrance, CA, USA). In situ hybridization was conducted on 10 μm cryosections as described114. To generate sense and antisense probes, plasmids were linearized and transcribed as follows: pGEM®-T Easy-RSPO1-SacII/SP6 and pGEM®-T Easy-RSPO1-SaI/T7; pCMV-Sport6-Neat1-NotI/SP6 and pCMV-Sport6-Neat1-SalI/T7. pCMV-Sport6-Neat1 vector was purchased from Horizon Discovery (Clone Id: 4,504,324).