Kidney fibrosis is the hallmark of chronic kidney disease progression; however, at present no antifibrotic therapies exist1,2,3. The origin, functional heterogeneity and regulation of scar-forming cells that occur during human kidney fibrosis remain poorly understood1,2,4. Here, using single-cell RNA sequencing, we profiled the transcriptomes of cells from the proximal and non-proximal tubules of healthy and fibrotic human kidneys to map the entire human kidney. This analysis enabled us to map all matrix-producing cells at high resolution, and to identify distinct subpopulations of pericytes and fibroblasts as the main cellular sources of scar-forming myofibroblasts during human kidney fibrosis. We used genetic fate-tracing, time-course single-cell RNA sequencing and ATAC–seq (assay for transposase-accessible chromatin using sequencing) experiments in mice, and spatial transcriptomics in human kidney fibrosis, to shed light on the cellular origins and differentiation of human kidney myofibroblasts and their precursors at high resolution. Finally, we used this strategy to detect potential therapeutic targets, and identified NKD2 as a myofibroblast-specific target in human kidney fibrosis.
Subscribe to Journal
Get full journal access for 1 year
only $3.90 per issue
All prices are NET prices.
VAT will be added later in the checkout.
Tax calculation will be finalised during checkout.
Rent or Buy article
Get time limited or full article access on ReadCube.
All prices are NET prices.
Processed data for all human and mouse RNA-seq and ATAC–seq libraries produced in this study are available at the Zenodo data archive (https://zenodo.org/record/4059315, https://doi.org/10.5281/zenodo.4059315). Processed and raw mouse data are available via the Gene Expression Omnibus (GEO) under the accessions GSE145173 (for mouse PDGFRαβ scRNA-seq and ATAC–seq) and GSE144528 (for mouse PDGFRβ Smart-Seq.). Source data are provided with this paper.
Custom scripts used in single cell and bulk RNA-seq data analysis are available at: https://github.com/mahmoudibrahim/KidneyMap. Scripts used for imaging in-situ hybridization data quantification are available at: https://gitlab.com/mklaus/segment_cells_register_marker.
Duffield, J. S. Cellular and molecular mechanisms in kidney fibrosis. J. Clin. Invest. 124, 2299–2306 (2014).
Kramann, R., DiRocco, D. P. & Humphreys, B. D. Understanding the origin, activation and regulation of matrix-producing myofibroblasts for treatment of fibrotic disease. J. Pathol. 231, 273–289 (2013).
Friedman, S. L., Sheppard, D., Duffield, J. S. & Violette, S. Therapy for fibrotic diseases: nearing the starting line. Sci. Transl. Med. 5, 167sr1 (2013).
Falke, L. L., Gholizadeh, S., Goldschmeding, R., Kok, R. J. & Nguyen, T. Q. Diverse origins of the myofibroblast—implications for kidney fibrosis. Nat. Rev. Nephrol. 11, 233–244 (2015).
Young, M. D. et al. Single-cell transcriptomes from human kidneys reveal the cellular identity of renal tumors. Science 361, 594–599 (2018).
Kang, H. M. et al. Defective fatty acid oxidation in renal tubular epithelial cells has a key role in kidney fibrosis development. Nat. Med. 21, 37–46 (2015).
Naba, A. et al. The extracellular matrix: Tools and insights for the “omics” era. Matrix Biol. 49, 10–24 (2016).
Fan, Y. et al. Comparison of kidney transcriptomic profiles of early and advanced diabetic nephropathy reveals potential new mechanisms for disease progression. Diabetes 68, 2301–2314 (2019).
Kriz, W., Kaissling, B. & Le Hir, M. Epithelial-mesenchymal transition (EMT) in kidney fibrosis: fact or fantasy? J. Clin. Invest. 121, 468–474 (2011).
Huang, S. & Susztak, K. Epithelial plasticity versus EMT in kidney fibrosis. Trends Mol. Med. 22, 4–6 (2016).
Elices, M. J. et al. VCAM-1 on activated endothelium interacts with the leukocyte integrin VLA-4 at a site distinct from the VLA-4/fibronectin binding site. Cell 60, 577–584 (1990).
Kang, H. M. et al. Sox9-positive progenitor cells play a key role in renal tubule epithelial regeneration in mice. Cell Rep. 14, 861–871 (2016).
Ramachandran, P. et al. Resolving the fibrotic niche of human liver cirrhosis at single-cell level. Nature 575, 512–518 (2019).
Wang, Y.-Y. et al. Macrophage-to-myofibroblast transition contributes to interstitial fibrosis in chronic renal allograft injury. J. Am. Soc. Nephrol. 28, 2053–2067 (2017).
Henderson, N. C. et al. Targeting of αv integrin identifies a core molecular pathway that regulates fibrosis in several organs. Nat. Med. 19, 1617–1624 (2013).
Wernig, G. et al. Unifying mechanism for different fibrotic diseases. Proc. Natl Acad. Sci. USA 114, 4757–4762 (2017).
Venkatachalam, M. A., Weinberg, J. M., Kriz, W. & Bidani, A. K. Failed tubule recovery, AKI-CKD transition, and kidney disease progression. J. Am. Soc. Nephrol. 26, 1765–1776 (2015).
Kramann, R. et al. Parabiosis and single-cell RNA sequencing reveal a limited contribution of monocytes to myofibroblasts in kidney fibrosis. JCI Insight 3, e99561 (2018).
Gerstein, M. B. et al. Architecture of the human regulatory network derived from ENCODE data. Nature 489, 91–100 (2012).
Buenrostro, J. D., Giresi, P. G., Zaba, L. C., Chang, H. Y. & Greenleaf, W. J. Transposition of native chromatin for fast and sensitive epigenomic profiling of open chromatin, DNA-binding proteins and nucleosome position. Nat. Methods 10, 1213–1218 (2013).
Palumbo-Zerr, K. et al. Orphan nuclear receptor NR4A1 regulates transforming growth factor-β signaling and fibrosis. Nat. Med. 21, 150–158 (2015).
Schubert, M. et al. Perturbation-response genes reveal signaling footprints in cancer gene expression. Nat. Commun. 9, 20 (2018).
Zhao, S. et al. NKD2, a negative regulator of Wnt signaling, suppresses tumor growth and metastasis in osteosarcoma. Oncogene 34, 5069–5079 (2015).
Li, C. et al. Myristoylated Naked2 escorts transforming growth factor α to the basolateral plasma membrane of polarized epithelial cells. Proc. Natl Acad. Sci. USA 101, 5571–5576 (2004).
Moerman, T. et al. GRNBoost2 and Arboreto: efficient and scalable inference of gene regulatory networks. Bioinformatics 35, 2159–2161 (2019).
Lemos, D. R. et al. Interleukin-1β activates a MYC-dependent metabolic switch in kidney stromal cells necessary for progressive tubulointerstitial fibrosis. J. Am. Soc. Nephrol. 29, 1690–1705 (2018).
Tsukui, T. et al. Collagen-producing lung cell atlas identifies multiple subsets with distinct localization and relevance to fibrosis. Nat. Commun. 11, 1920 (2020).
Adams, T. S. et al. Single-cell RNA-seq reveals ectopic and aberrant lung-resident cell populations in idiopathic pulmonary fibrosis. Sci. Adv. 6, eaba1983 (2020).
Kramann, R. et al. Perivascular Gli1+ progenitors are key contributors to injury-induced organ fibrosis. Cell Stem Cell 16, 51–66 (2015).
Takasoto, M. et al. Kidney organoids from human iPS cells contain multiple lineages and model human nephrogenesis. Nature 526, 564–568 (2015).
Corces, M. R. et al. An improved ATAC-seq protocol reduces background and enables interrogation of frozen tissues. Nat. Methods 14, 959–962 (2017).
Srivastava, A., Malik, L., Smith, T., Sudbery, I. & Patro, R. Alevin efficiently estimates accurate gene abundances from dscRNA-seq data. Genome Biol. 20, 65 (2019).
Patro, R., Duggal, G., Love, M. I., Irizarry, R. A. & Kingsford, C. Salmon provides fast and bias-aware quantification of transcript expression. Nat. Methods 14, 417–419 (2017).
Frankish, A. et al. GENCODE reference annotation for the human and mouse genomes. Nucleic Acids Res. 47 (D1), D766–D773 (2019).
Lun, A. T. L., McCarthy, D. J. & Marioni, J. C. A step-by-step workflow for low-level analysis of single-cell RNA-seq data with Bioconductor. F1000 Res. 5, 2122 (2016).
Scrucca, L., Fop, M., Murphy, T. B. & Raftery, A. E. mclust 5: clustering, classification and density estimation using gaussian finite mixture models. R J. 8, 289–317 (2016).
Fortunato, S. & Barthélemy, M. Resolution limit in community detection. Proc. Natl Acad. Sci. USA 104, 36–41 (2007).
Zeisel, A. et al. Molecular architecture of the mouse nervous system. Cell 174, 999–1014.e22 (2018).
Lake, B. B. et al. A single-nucleus RNA-sequencing pipeline to decipher the molecular anatomy and pathophysiology of human kidneys. Nat. Commun. 10, 2832 (2019).
Clark, J. Z. et al. Representation and relative abundance of cell-type selective markers in whole-kidney RNA-Seq data. Kidney Int. 95, 787–796 (2019).
Wu, C. et al. BioGPS: an extensible and customizable portal for querying and organizing gene annotation resources. Genome Biol. 10, R130 (2009).
Durinck, S. et al. BioMart and Bioconductor: a powerful link between biological databases and microarray data analysis. Bioinformatics 21, 3439–3440 (2005).
Durinck, S., Spellman, P. T., Birney, E. & Huber, W. Mapping identifiers for the integration of genomic datasets with the R/Bioconductor package biomaRt. Nat. Protoc. 4, 1184–1191 (2009).
Stuart, T. et al. Comprehensive integration of single-cell data. Cell 177, 1888–1902.e21 (2019).
Xu, C. & Su, Z. Identification of cell types from single-cell transcriptomes using a novel clustering method. Bioinformatics 31, 1974–1980 (2015).
Macosko, E. Z. et al. Highly parallel genome-wide expression profiling of individual cells using nanoliter droplets. Cell 161, 1202–1214 (2015).
Alter, O., Brown, P. O. & Botstein, D. Singular value decomposition for genome-wide expression data processing and modeling. Proc. Natl Acad. Sci. USA 97, 10101–10106 (2000).
Rosvall, M. & Bergstrom, C. T. Maps of random walks on complex networks reveal community structure. Proc. Natl Acad. Sci. USA 105, 1118–1123 (2008).
Dahlin, J. S. et al. A single-cell hematopoietic landscape resolves 8 lineage trajectories and defects in Kit mutant mice. Blood 131, e1–e11 (2018).
Ibrahim, M. M. & Kramann, R. genesorteR: feature ranking in clustered single cell data. Preprint at https://doi.org/10.1101/676379 (2019).
Haghverdi, L., Lun, A. T. L., Morgan, M. D. & Marioni, J. C. Batch effects in single-cell RNA-sequencing data are corrected by matching mutual nearest neighbors. Nat. Biotechnol. 36, 421–427 (2018).
Blondel, V. D., Guillaume, J.-L., Lambiotte, R. & Lefebvre, E. Fast unfolding of communities in large networks. J. Stat. Mech. 2008, P10008 (2008).
Lun, A. T. L., Bach, K. & Marioni, J. C. Pooling across cells to normalize single-cell RNA sequencing data with many zero counts. Genome Biol. 17, 75 (2016).
Rand, W. M. Objective criteria for the evaluation of clustering methods. J. Am. Stat. Assoc. 66, 846–850 (1971).
Yang, L., Liu, J., Lu, Q., Riggs, A. D. & Wu, X. SAIC: an iterative clustering approach for analysis of single cell RNA-seq data. BMC Genomics 18 (Suppl. 6), 689 (2017).
Karaiskos, N. et al. The Drosophila embryo at single-cell transcriptome resolution. Science 358, 194–199 (2017).
McInnes, L., Healy, J. & Melville, J. UMAP: uniform manifold approximation and projection for dimension reduction. Preprint at https://arxiv.org/abs/1802.03426 (2018).
Street, K. et al. Slingshot: cell lineage and pseudotime inference for single-cell transcriptomics. BMC Genomics 19, 477 (2018).
Subramanian, A. et al. Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles. Proc. Natl Acad. Sci. USA 102, 15545–15550 (2005).
Liberzon, A. et al. Molecular signatures database (MSigDB) 3.0. Bioinformatics 27, 1739–1740 (2011).
Yu, G., Wang, L.-G., Han, Y. & He, Q.-Y. clusterProfiler: an R package for comparing biological themes among gene clusters. OMICS 16, 284–287 (2012).
The Gene Ontology Consortium. Gene ontology: tool for the unification of biology. Nat. Genet. 25, 25–29 (2000).
Holland, C. H. et al. Robustness and applicability of transcription factor and pathway analysis tools on single-cell RNA-seq data. Genome Biol. 21, 36 (2020).
Macosko, E. Z. et al. Highly parallel genome-wide expression profiling of individual cells using nanoliter droplets. Cell 161, 1202–1214 (2015).
Yang, J. et al. Single cell transcriptomics reveals unanticipated features of early hematopoietic precursors. Nucleic Acids Res. 45, 1281–1296 (2017).
Gu, Z., Eils, R. & Schlesner, M. Complex heatmaps reveal patterns and correlations in multidimensional genomic data. Bioinformatics 32, 2847–2849 (2016).
Bushnell, B. BBMap: a Fast, Accurate, Splice-Aware Aligner https://www.osti.gov/biblio/1241166 (2014).
Dobin, A. et al. STAR: ultrafast universal RNA-seq aligner. Bioinformatics 29, 15–21 (2013).
Li, H. et al. The Sequence Alignment/Map format and SAMtools. Bioinformatics 25, 2078–2079 (2009).
Quinlan, A. R. BEDTools: the Swiss-army tool for genome feature analysis. Curr. Protoc. Bioinformatics 47, 11.12.1–11.12.34 (2014).
Adey, A. et al. Rapid, low-input, low-bias construction of shotgun fragment libraries by high-density in vitro transposition. Genome Biol. 11, R119 (2010).
Ibrahim, M. M., Lacadie, S. A. & Ohler, U. JAMM: a peak finder for joint analysis of NGS replicates. Bioinformatics 31, 48–55 (2015).
Luehr, S., Hartmann, H. & Söding, J. The XXmotif web server for eXhaustive, weight matriX-based motif discovery in nucleotide sequences. Nucleic Acids Res. 40, W104–W109 (2012).
Grant, C. E., Bailey, T. L. & Noble, W. S. FIMO: scanning for occurrences of a given motif. Bioinformatics 27, 1017–1018 (2011).
Ramírez, F. et al. deepTools2: a next generation web server for deep-sequencing data analysis. Nucleic Acids Res. 44 (W1), W160–W165 (2016).
Robinson, J. T. et al. Integrative genomics viewer. Nat. Biotechnol. 29, 24–26 (2011).
Aibar, S. et al. SCENIC: single-cell regulatory network inference and clustering. Nat. Methods 14, 1083–1086 (2017).
Schacht, T., Oswald, M., Eils, R., Eichmüller, S. B. & König, R. Estimating the activity of transcription factors by the effect on their target genes. Bioinformatics 30, i401–i407 (2014).
Alvarez, M. J. et al. Functional characterization of somatic mutations in cancer using network-based inference of protein activity. Nat. Genet. 48, 838–847 (2016).
Garcia-Alonso, L., Holland, C. H., Ibrahim, M. M., Turei, D. & Saez-Rodriguez, J. Benchmark and integration of resources for the estimation of human transcription factor activities. Genome Res. 29, 1363–1375 (2019).
de Kanter, J. K., Lijnzaad, P., Candelli, T., Margaritis, T. & Holstege, F. C. P. CHETAH: a selective, hierarchical cell type identification method for single-cell RNA sequencing. Nucleic Acids Res. 47, e95 (2019).
Efremova, M., Vento-Tormo, M., Teichmann, S. A. & Vento-Tormo, R. CellPhoneDB: inferring cell–cell communication from combined expression of multi-subunit ligand–receptor complexes. Nat. Protoc. 15, 1484–1506 (2020).
Soneson, C., Love, M. I. & Robinson, M. D. Differential analyses for RNA-seq: transcript-level estimates improve gene-level inferences. F1000 Res. 4, 1521 (2015).
Ritchie, M. E. et al. limma powers differential expression analyses for RNA-sequencing and microarray studies. Nucleic Acids Res. 43, e47 (2015).
Korotkevich, G., Sukhov, V. & Sergushichev, A. Fast gene set enrichment analysis. Preprint at https://doi.org/10.1101/060012 (2019).
This work was supported by grants of the German Research Foundation (DFG: SFBTRR57, P 25 and P30; SFBTRR219 C01 and C05, CRU344 KR 4073/9-1, P1 and SCHN 1188/5-1), by a Grant of the European Research Council (ERC-StG 677448), a Grant of the State of North Rhine-Westphalia (Return to NRW) and a Grant of the Interdisciplinary Centre for Clinical Research (IZKF) within the faculty of Medicine at the RWTH Aachen University (O3-11) all to RK and by a START grant of the RWTH Aachen University to MMI (691906/2018-1) and a grant of the German Society of Internal Medicine (DGIM) to CK. P.B. received funding from the DFG (SFB/TRR57, SFB/TRR219, BO3755/3-1, BO3755/9-1). This work was also supported by the BMBF eMed Consortia Fibromap (to V.G.P., R.K. and R.K.S.), iPRIME-EKFS to M.K., and DFG (CRC/1192 to T.B.H. and V.G.P.). J.R.S. was supported by a Wellcome Trust PhD fellowship (ref. 104366/Z/14/Z). N.C.H. is supported by a Wellcome Trust Senior Research Fellowship in Clinical Science (ref. 219542/Z/19/Z), Medical Research Council, Chan Zuckerberg Initiative Seed Network Grant, and British Heart Foundation grants (RM/17/3/33381; RE/18/5/34216). We thank the Eukaryotic Single-Cell Genomics facility at SciLifeLab Stockholm for generating Smart-Seq2 single cell sequencing libraries.
The authors declare no competing interests.
Peer review information Nature thanks Menna Clatworthy, Jeremy Duffield and Simon Mendez-Ferrer for their contribution to the peer review of this work.
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data figures and tables
a, A schematic of human nephrectomy kidneys. Kidney samples were sampled from the tumour-free kidney cortex distant from the tumour region. b, A schematic of the whole kidney sorting strategy. Single cell 10x Genomics RNA-seq libraries were prepared from CD10−, living (DAPI−) and CD10+, living (DAPI−) cells separately. CD10− cells are enriched for mesenchymal cells. c, Immunofluorescence staining of CD10, LTA, CD45 and WT1. CD10 expression labels proximal tubule epithelial cells. d, Representative flow cytometric plots from the sorting and gating strategy described in c enriching for CD10− cells. e, Relationship between serum creatinine and age in patients included in the scRNA-seq experiments of Fig. 1. f, Relationship between serum creatinine and degree of interstitial fibrosis scored by a blinded nephropathologist for the same patients in e. g, Relationship between serum creatinine and tubular atrophy as scored by a blinded nephropathologist for the same patients in e and f. h, Representative images of PAS stained (left) and collagen 3 immunostained kidneys of patients with eGFR > 60 ml min−1 per 1.73 m2 body surface area. i, As in h but patient with eGFR < 60 ml min−1 per 1.73 m2 body surface area. j, Each patient visualized in the UMAP of Fig. 1b. k, The main five cell types found in the CD10− fraction, illustrated on the same UMAP embedding from Fig. 1b. l–r. Expression of select marker genes visualized on the same UMAP embedding. s, Each cell state/type visualized in the UMAP of Fig. 1b. t, Doublet score (see Methods) for human CD10+ cells. u, Doublet score (see Methods) for human CD10− cells. Scale bars, 50 μm (c1–c3), 30 μm (c4’–c4’’), in 75 μm (h, i).
a, Scaled gene expression of marker genes in mesenchymal cell clusters of the CD10− data depicted in Fig. 1b–e. Each 100 cells are averaged in one column. b, As in a, but endothelial clusters. c, As in a, but immune clusters. d, As in a but epithelial clusters. e, The seven cell clusters found in the CD10+ fraction cells visualized on the UMAP embedding from Fig. 1e. f, UMAP embedding of e with colours representing the cell types/states. g, Scaled ECM score on UMAP from e. h, Eight patients visualized on UMAP embedding as in e. i, Expression of RBP4 visualized on UMAP embedding from e. j, Scaled gene expression of the top 20 genes by specificity of each of the 7 cell clusters discovered in the CD10+ data depicted in e–h. k, log-transformed fold change of cell cycle stage assignment frequencies in healthy and CKD epithelial cells relative to permuted frequencies. Positive numbers represent enrichment; negative numbers represent depletion. l, Percentage of cells per cell cluster in each cell cycle phase as predicted from gene expression data. m, n, KEGG and GEO process terms enriched in cells belonging to healthy controls or patients with CKD, according to differentially expressed genes between healthy and diseased patients (see Fig. 1f). Note fatty acid catabolic process and lipid oxidation consistent with KEGG pathway enrichment results o, ECM, collagen, proteoglycan and glycoprotein score of human diabetic kidney dataset8. Advanced diabetic nephropathy (DN) n = 21, early DN n = 6, control n = 9. p, Distribution of single-cell ECM scores for all cells in the CD10− cell fraction, colours indicate cell groups obtained by unsupervised mixture model clustering of ECM scores. q, ECM groups visualized on the UMAP of Fig. 1b. r, ECM scores as in p, but scaled and visualized on the UMAP embedding from Fig. 1b. s–u, Scaled expression of gene groups summarized in the ECM score including collagens (r), glycoproteins (s) and proteoglycans (t). All 50 cell clusters are shown, all cells from each cluster are averaged in one column.
a, UMAP embedding of fibroblast, pericyte and myofibroblast cells from 13 human kidneys (n = 2,689). Colours represent the cell types. Lines refer to a lineage trajectory predicted by slingshot (Methods). b, Expression of selected genes on the embedding of a. c, Gene Ontology biological process analysis for pericyte, myofibroblast, fibroblast cell clusters and vascular smooth muscle cells based on the top marker genes for each cluster (CD10− data; see Methods). d, ECM score and scaled expression of select genes visualized on the mesenchymal cell diffusion map embedding of Fig. 1o. e–h, The distribution of ECM score, collagen score, glycoprotein score and proteoglycan scores stratified by epithelial cell clusters in the CD10− cell fraction. i, Scaled expression of select genes in proximal tubules and injured proximal tubule cell clusters. Each 100 cells are averaged in one column. j, GO biological process analysis based on differential expression between proximal tubules and injured proximal tubules. k–n, The distribution of ECM score, collagen score, glycoprotein score and proteoglycan scores for epithelial cells (CD10+ cell fraction). o, Percentage of cells expressing PDGFRβ and COL1α1 in each main cell niche. Neuronal Schwann cells were excluded because they are represented by a small number of cells.
a, Patient samples (n = 8) visualized on the UMAP from Fig. 2a. Different cell clusters are indicated by different colours. Stratification of single cells according to patient clinical parameters. b, c, Expression of select genes on the same UMAP embedding from a. d, Scaled gene expression of the top 10 genes in each cell type/state cluster. Gene ranking per cell cluster was determined by genesorteR. e, Correlation between cell clusters identified in CD10− data (Fig. 1, columns) and PDGFRβ+ data (Fig. 2, rows). f, ECM score stratified by four main cell types in PDGFRβ+ data. g, ECM score stratified by main mesenchymal cell types. h, ECM score stratified by five epithelial cells clusters. i, ECM score visualized on the UMAP embedding from a. j, Doublet score (Methods) for human PDGFRβ+ cells. k, Representative image of combined immunofluorescent and multiplex RNA ISH of LTA (proximal tubular marker), COL1A1 and PDGFRB. Note COL1A1 and PDGFRB expression in LTA+ tubular cells (j, arrows). l, Representative image of combined immunofluorescent and multiplex RNA ISH of CD68 (macrophage marker), COL1A1 and PDGFRB. m, Representative image of multiplex RNA ISH of PECAM1, COL1A1 and PDGFRB. Scale bars, 50 μm.
a, The mesenchymal cell clusters in Fig. 2 here indicated on the diffusion map embedding from Fig. 2c (left) and stratified by eGFR class (right) and the expression of selected genes on the same embedding. b, UMAP embedding of mesenchymal cell populations from Fig. 2a. Colours represent the cell types/states shown in Fig. 2a. c, ECM score visualized on the UMAP in b. UMAP embedding indicates distinct and separate pericyte and fibroblast origins for myofibroblasts, consistent with diffusion map embedding of the same cells (Fig. 2e). d, Three main pericyte and (myo)fibroblast cell types indicated on the same UMAP embedding. e, Pseudotime as predicted by the Slingshot algorithm on the same UMAP embedding from b. f, g, COL1α1 and NOTCH3 expression on the UMAP embedding from b. h, i, Violin plots across mesenchymal cells types of COL1A1 and POSTN of human PDGFRβ+ dataset in Fig. 2. j, Quantification of MEG− (NOTCH3−POSTN−) cells in human kidneys (n = 35) (see patient data in Supplementary Table 2). n = 17 (healthy), 10 (early) and 8 (late); *P < 0.05, **P < 0.01 by one-way ANOVA followed by Bonferroni’s correction. Tukey box whisker plot. k–m, Representative image of multiplex RNA ISH of MEG3, NOTCH3 and POSTN. Note that triple-positive cells (arrow with tails) or double-positive cells (NOTCH3+POSTN+, l magnification 2, arrowheads) can be detected in the kidney interstitium. n, Immunofluorescence staining of CXCL12 (SDF-1). Note expression in the kidney interstitium in PDGFRβ+ cells (arrow with tails) and LTA− tubular cells (arrows). o–q, Representative image of multiplex RNA ISH of CCL19, CCL21 and PDGFRB. r, Quantification of CCL19+CCL21+ cells in human kidneys (n = 35) (patient data in Supplementary Table 2). n = 17 (healthy), 10 (early) and 8 (late); ***P < 0.001, ****P < 0.0001 by one-way ANOVA followed by Bonferroni’s post hoc test. Tukey box and whisker plot. For details on statistics and reproducibility, see Methods.
a, KEGG pathway enrichment analysis along pseudotime for lineage 1 (see Fig. 2c.) b, Top, gene expression dynamics along pseudotime for lineage 2 (fibroblasts to myofibroblasts, see Fig. 2c). Cells (in columns) were ordered along pseudotime and genes (in rows) that correlate with pseudotime were selected and plotted along pseudotime (Methods). Each 10 cells were averaged in one column. Genes were grouped signifying their pseudotime expression pattern. Selected example genes for each group are indicated. See Supplementary Data 3 for gene cluster assignments. Bottom, cell cycle stage along pseudotime as percentage of each 500 cells along pseudotime. c, As in b but for lineage 3 cells (see Fig. 2c). d, PID signalling pathway enrichment analysis along pseudotime for lineage 2 cells ordered along pseudotime as in b. e, KEGG pathway enrichment analysis along pseudotime for lineage 2. f, As in d. but for lineage 3 cells (see Fig. 2c). g, As in e but for lineage 3 cells. h–k, Violin plots across mesenchymal cells types of selected genes of the human PDGFRβ+ dataset in Fig. 2. l, Transcription factor scores for proximal promoter regions (l) and distal regions (m) obtained by transcription factors equence motif enrichment analysis for top marker genes for the mesenchymal cell clusters of the human PDGFRβ+ dataset (Methods). Note enrichment of Fos and Jun motifs in promoters of fibroblast marker genes. m, Schematic of human kidney PDGFRβ+ cell generation and immortalization. n, Cell proliferation (WST-1) and expression of FOS, COL1A1, POSTN and OGN in immortalized human PDGFRβ kidney cells by RNA qPCR after treatment with the AP-1 inhibitor T-5224 and/or TGFβ. n = 3 per group. *P < 0.05, **P < 0.01, ***P < 0.001, ****P < 0.0001, one-way ANOVA followed by Bonferroni’ post hoc test. Data are mean ± s.d. o, Expression of OGN (Fib1+3) and POSTN (MF1) visualized on the same UMAP embedding from Extended Data Fig. 5b. p, AP-1 average transcription factor expression (left) and average expression of putative AP-1-regulated genes (right) against collagen scores stratified by fibroblast and myofibroblast cells. Notably, the expression of AP-1 anti-correlates with collagen score but the expression of its target genes positively correlates with collagen score, potentially pointing towards an inhibitory role for AP-1. q, The number of statistically significant receptor–ligand interactions between mesenchymal cells and all other cell types (CD10− fraction; Fig. 1) according to CellphoneDB Analysis. Dendritic cells, monocytes, myofibroblasts, podocytes, arteriolar endothelial cells and injured tubules as the main sources of signalling ligands to pericytes fibroblasts and myofibroblasts. r, Dot plot for significant ligand–receptor interactions from the selected signalling pathways EGFR, PDGF, WNT, TGFβ, Notch and Hedgehog for pericytes, fibroblast and myofibroblasts. Interacting ligand–receptor and cell types are shown by pairs. The first cell type of the interacting pair expresses the ligand and the second cell type expresses the receptor (that is, first and second proteins from the interaction, respectively). Ligand–receptor interactions are grouped by signalling pathways. Yellow: EGFR, pink: PDGF, green: WNT, red: TGFβ, black: Notch, blue (light or dark): mixed of TGFβ and EGFR. None of the hedgehog interactions were significant. For details on statistics and reproducibility, see Methods.
a, Representative image of COL1α1 in-situ hybridization in a Pdgfrb-creER;tdTomato kidney after UUO surgery. Scale bar, 10 μm. b, c, Quantification of αSMA+ cells in PDGFRβtdtom+ kidneys from UUO day 10. n = 3. Data are mean ± s.d. d, Scaled expression of the top 10 genes by specificity in each cell cluster depicted in Fig. 3e. All cells from each cell cluster are averaged in one column. e, Expression of select genes in all 10 cell clusters from Fig. 3e. f, ECM score visualized on the same UMAP embedding from Fig. 3e. g, Distribution of ECM score, collagen score, glycoprotein score and proteoglycan score per cell cluster. h, Immuno-fluorescence staining in sham and UUO (day 10) mouse kidney showing PDGFRα expression in a subset of Pdgfrb-creER;tdTomato positive cells (arrows). i, RNA ISH showing co-localization of Col1a1 expression in PDGFRα PDGFRβ double-positive cells. COL1α1 PDFGRα PDGFRβ triple-positive cells (arrows) occur solely in the kidney interstitium. j, Left, COL1α1 expression and ECM score in CD10− cells (Fig. 1b) stratified according to PDGFRα and PDGFRβ expression. Right, percentage of COL1α1+ and COL1α1− cells in the same data, stratified in the same way. COL1α1− cells occur mostly in PDGFRα PDGFRβ double-negative cells whereas COL1α1+ cells occur predominantly in PDGFRα PDGFRβ double-positive cells (n = 51,849). Group comparisons: (other genes) vs (a/b): P = 2.618674 × 10−317, (a) vs (a/b): P = 5.277528 × 10−53, (b) vs (a/b): P = 1.243014 × 10−233, (other genes) vs (a): P = 1.531528 × 10−177, (b) vs (a): P = 6.344322 × 10−64, (other genes) vs. (b): P = 8.89697 × 10−152. Bonferroni corrected P values based on a two-sided Wilcoxon rank sum test. k, Distribution of immunofluorescence/TA score over 62 patients and representative image of a trichrome stained human kidney tissue microarray (TMA) stained by multiplex RNA ISH using PDGFRA, PDGFRB and COL1A1 probes with nuclear counterstain (DAPI) of 62 kidneys (patient data in Supplementary Table 2) (left), average scaled COL1α1 expression in the ISH data stratified by PDGFRα PDGFRdetection in the same data (middle) and percentage of COL1α1-positive and -negative cells in the same data stratified in the same way (right). Group comparison: (α/β) vs (COL1α1): P ≈ 0, (α/β) vs (β): P ≈ 0, (α−) vs (α): P ≈ 0. Bonferroni corrected P values based on a two-sided Wilcoxon rank sum test. l–p, A diffusion map embedding of pericytes and matrix-producing cells with annotation of the different time points in m, cell cluster annotation in n and scaled expression of selected genes in o–q. q, The surgery type per cell (sham versus UUO) visualized on the same UMAP embedding from Fig. 4c (top), or with colours representing the cell types/states (bottom). r, Expression of select genes on the same UMAP embedding from Fig. 3j. s, ECM and collagens score distribution for the four main cell types (top) and for mesenchymal clusters (bottom). Scale bars, 50 μm (b, h, i), 10 μm (k). For details on statistics and reproducibility, see Methods.
a, Scaled expression of the top 10 genes by specificity in each of the mesenchymal cell clusters depicted in Fig. 3d. Each 100 cells are averaged in one column. b, Cell doublet score (Methods) of mouse PDGFRα+PDGFRβ+ dataset per cell cluster. c, Violin plot of COL15α1 expression per cell cluster. Only mesenchymal cells are shown. Bonferroni corrected P values based on a two-sided Wilcoxon rank sum test in Supplementary Data 4. d, UMAP embedding of MEG3 as in Fig. 2a and multiplex in-situ staining of MEG3 on human kidney tissue. Scale bars, 30 μm (d1), 40 μm (d2+3). e, Representative image of multiplex RNA ISH for PDGFRA, PDGFRB and MEG3 in n = 34 human kidneys (patient data in Supplementary Table 2). MEG3 colocalizes with PDGFRA and PDGFRB. Scale bar, 10 μm. f, Percentage of MEG3− cells out of PDGFRα+PDGFRβ+ double-positive cells, quantified from RNA ISH. n = 34. Tukey box and whisker plot. g, Expression of select genes on the UMAP embedding from Fig. 3j. For details on statistics and reproducibility, see Methods.
Extended Data Fig. 9 Correlation of human and mouse populations and distinct gene-regulatory programs of the mesenchyme.
a, Classification tree of human PDGFRβ dataset derived by the CHETAH algorithm based on single-cell expression and clustering information. b, Supervised classification of mouse PDGFRα+PDGFRβ+ cells using human PDGFRβ+ cells as a reference (see classification tree in a). Heat map displays percentage of mouse PDGFRα+PDGFRβ+ cells in each mouse cell cluster. Fibroblasts 1 in mice are largely classified as fibroblast 1 according to human data. Mouse myofibroblasts are classified as node 15 and myofibroblast 2b in humans indicating variability between mouse and human with myofibroblast states. c, Schematic of proposed cellular origin of fibrosis. d, Scaled gene expression of transcription factors discovered by ATAC–seq (see Fig. 3o) in six fibroblasts and myofibroblast cell populations. e, ATAC–seq signal for motif matches inside open chromatin regions for five selected transcription factors. f, Genome browser snapshots for select genes. ATAC–seq signal and motif matches in open chromatin regions are shown. Multiz Align is conservation scores between mouse and human, ClinVar lift is clinical variants lift to mouse genome coordinates. Nrf, Irf8, Elf1, Ets1 and Klf1 motifs are located in promoter and enhancer open chromatin regions of myofibroblast associated genes such as Col1a1, Col15a1, Tgfb and Nkd2. Creb5_Atf3 is found in genes associated to Fib1 cluster, such as Tmeff2. g, Expression of some of the genes investigated in g–i. Visualized on the UMAP embedding from Fig. 4c. i, Scaled expression of genes that correlate or anti-correlate with injury time across matrix producing cells (mouse PDGFRβ+ data). Note the expression of Ogn, Scara5 and Pcolce2 is largely specific to day 0–2 cells, whereas the expression of Nkd2, Fbn2 and Nkd1 is specifically increased at day 10 after UUO. h, Signalling pathway enrichment in the mesenchymal cell clusters depicted in Fig. 3o.
a, b, GO biological process terms for genes that correlate or anti-correlate with NKD2+ expression across single cells in pericytes fibroblasts and myofibroblasts in mouse PDGFRα+ PDGFRβ+ data (a) and human PDGFRβ+ data (b). Genes correlated with NKD2+ expression are related to ECM expression, integrin signalling and focal adhesion. c, Pathway activity as estimated by the PROGENy algorithm in NKD2+ versus NKD2− cells from the human PDGFRβ+ dataset. P > 0.05 n.s., *P < 0.05, **P < 0.01, ***P < 0.001, P values were adjusted for multiple testing using Benjamin and Hochberg FDR method. d, Scaled gene expression of top 100 genes whose expression is correlated or anti-correlated with NKD2 expression across single cells in human PDGFRβ+ data (see b). e, Gene regulatory network predicted based on the expression of cells and genes depicted in l. using the GRNBoost2+ algorithm. Connection between genes indicate putative direct or indirect regulatory interactions. Colours indicate clustering of the gene regulatory network using the Louvain algorithm and highlights the regulatory network of ECM expression (module 2, NKD2+) and fibroblast and pericyte maintenance (module 4 and 3) f, Module 2 from l. Depicted separately, connections of NKD2 are highlighted in red. g, Expression of genes highlighted in e and f, including Etv1 transcription factor and LAMP5 which are both directly connected to Nkd2 in e and f. h, Expression of COL1A1, FN1 and ACTA2 by qPCR after NKD2 overexpression in human immortalized PDGFRβ+ cells treated with TGFβ or vehicle (PBS). n = 3 per group. P values determined by one-way ANOVA followed by Bonferroni’s post hoc test. Data are mean ± s.d. i, Expression of NKD2 by RNA qPCR in NKD2-knockout cells. ****P < 0.0001, one-way ANOVA followed by Bonferroni’s post hoc test. Data are mean ± s.d. j, Expression of COL1A1, FN1 and ACTA2 by RNA qPCR after NKD2 knockout in the same clones depicted in h. n = 3 per group. #P < 0.05, ##P < 0.01, ###P < 0.001, ####P < 0.0001 (versus control NTG); ****P < 0.0001 (versus TGFβ NTG), two-way ANOVA followed by Sidak’s post hoc test. Data are mean ± s.d. k, PID signalling pathways enriched in PDGFRβ+ NKD2-KO clones and overexpression (up indicates upregulated genes in indicated condition, and down indicates down regulated genes). l, GO biological process terms enriched in PDGFRβ+ NKD2-KO clones (up indicates upregulated genes in KO condition, and down indicates down regulated genes). m, Scaled gene expression of WNT pathway receptors and ligands in NKD2-perturbed human kidney PDGFRβ+ cells. *P < 0.05, **P < 0.01, ***P < 0.001, empirical Bayes from the test for differential expression after adjusting P values for multiple testing correction (Benjamini and Hochberg). n, Representative image of multiplex RNA ISH of PDGFRα, PDGFRβ and NKD2 in human iPS cell-derived kidney organoids. o, Immunofluorescence staining of human iPS cell-derived kidney organoids (day 7 + 18, 2D + 3D culture, respectively). LTA and HNF4α mark proximal tubular like-cells. Pan-CK (cytokeratin) marks epithelial-like cells. ERG (ETS regulated-gene) marks endothelial-like cells. DACH1 and NEPHS1 mark podocyte-like cells. COL1α1 marks fibroblast/myofibroblasts. p, Immunofluorescence staining of COL1α1 in IL-1β-treated kidney organoids. Scale bars, 40 μm (n, o) 50 μm (p). For details on statistics and reproducibility, see Methods.
This file contains Supplementary Figs 1-2 (FACS gating strategy and the uncropped blots), and Supplementary Tables 1-3.
Cell Cluster Names and Cell Numbers.
Gene Expression (as % of detected cells) and Specificity in Cell Clusters.
Gene clusters along psuedotime and in Nkd2 regulatory network analysis, related to human PDGFRB+ data (Figure 2, Figure 4, Extended Data Figs 6 and 10).
P-value tables related to comparisons in Extended Data Figs 7 and 8.
Gene expression source data matrix for gene expression along myofibroblast differentiation lineages, related to human PDGFRB+ data (Fig. 2, Extended Data Fig. 6).
About this article
Cite this article
Kuppe, C., Ibrahim, M.M., Kranz, J. et al. Decoding myofibroblast origins in human kidney fibrosis. Nature 589, 281–286 (2021). https://doi.org/10.1038/s41586-020-2941-1
Stem Cell Research & Therapy (2021)
Nature Reviews Nephrology (2021)
Scientific Reports (2021)
Nature Communications (2021)
Nature Reviews Nephrology (2021)