The mammary gland is composed of basal cells and luminal cells. It is generally believed that the mammary gland arises from embryonic multipotent progenitors, but it remains unclear when lineage restriction occurs and what mechanisms are responsible for the switch from multipotency to unipotency during its morphogenesis. Here, we perform multicolour lineage tracing and assess the fate of single progenitors, and demonstrate the existence of a developmental switch from multipotency to unipotency during embryonic mammary gland development. Molecular profiling and single-cell RNA-seq revealed that embryonic multipotent progenitors express a unique hybrid basal and luminal signature and the factors associated with the different lineages. Sustained p63 expression in embryonic multipotent progenitors promotes unipotent basal cell fate and is sufficient to reprogram adult luminal cells into basal cells by promoting an intermediate hybrid multipotent-like state. Altogether, this study identifies the timing and the mechanisms mediating early lineage segregation of multipotent progenitors during mammary gland development.
Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
The authors acknowledge the animal house facility from ULB (Erasme campus). Sequencing was performed at the Brussels Interuniversity Genomics High Throughput core (www.brightcore.be) and the Genomics Core Leuven. The authors thank N. Dedoncker for help with single-cell RNA-seq library construction. C.B. is an investigator with WELBIO. A.W. is supported by a FNRS fellowship. M.F. is supported by a Télévie fellowship. A.V.K. is Maître de Recherches of the FNRS. A.S., D.B. and T.V. are supported by KU Leuven (SymBioSys, PFV/10/016), Stichting Tegen Kanker (2015-143) and FWO (postdoctoral fellow number 12W7318N, [PEGASUS]² Marie Skłodowska-Curie fellow number 12O5617N). The authors thank colleagues who provided reagents mentioned in the text, and J.-M. Vanderwinden for help with confocal imaging. This work was supported by the FNRS, a research grant from the Fondation Contre le Cancer, the ULB fondation, the Fond Gaston Ithier, the Télévie, the foundation Bettencourt Schueller, the foundation Baillet Latour, and the European Research Council (EXPAND).
Integrated supplementary information
a-g, Unicellular suspensions of skin and mammary bud cells from Lgr5-IRES-GFP E14 embryos, stained for Lin (CD31, CD45, CD140a) in APC and CD49f in PE, were gated as shown in a to eliminate debris. Doublets were discarded with the gate shown in b followed by the gate shown in c, living cells were gated by DAPI dye exclusion as shown in d, and non-epithelial Lin-positive cells were discarded in e. The CD49f Hi cells were gated as shown in f and the GFP+ cells as shown in g. h-o, Unicellular suspensions of mammary cells from adult K8rtTA/TetOCre/ΔNp63-IRES-GFP mice, induced at P30, analysed at P45 and stained for Lin (CD31, CD45, CD140a) in PE, CD24 in PECy7 and CD29 in APC, were gated as shown in h to eliminate debris. Doublets were discarded with the gate shown in i followed by the gate shown in j, living cells were gated by DAPI dye exclusion as shown in k, non-epithelial Lin-positive cells were discarded in l, and the GFP+ cells were gated as shown in m. CD24 and CD29 expression was studied in Lin- cells (n) or in YFP+ cells (o). The CD24+CD29Lo gate corresponds to luminal cells (LC), while the CD24+CD29Hi gate corresponds to basal cells (BC). The stromal population corresponds to cells labelled owing to the leakiness of the Tet-O-Cre, as described previously in reference 14.
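The sequential gating order described for panels a-g can be sketched as a chain of filters, each applied only to the events that passed the previous gate. This is an illustrative sketch, not the authors' analysis: the field names (fsc_a, fsc_w, ssc_w, dapi, lin, cd49f, gfp), every threshold, and the choice of forward- and side-scatter width for the two doublet gates are hypothetical assumptions.

```python
# Hypothetical gates: field names and thresholds are invented for illustration.
GATES = [
    ("not debris", lambda e: e["fsc_a"] > 50_000),
    ("singlets, first gate", lambda e: e["fsc_w"] < 100_000),
    ("singlets, second gate", lambda e: e["ssc_w"] < 100_000),
    ("live (DAPI excluded)", lambda e: e["dapi"] < 1_000),
    ("Lin negative", lambda e: e["lin"] < 1_000),
    ("CD49f Hi", lambda e: e["cd49f"] > 10_000),
    ("GFP positive", lambda e: e["gfp"] > 1_000),
]


def apply_gates(events, gates):
    """Apply gates in order; return surviving events and the count after each gate."""
    remaining = list(events)
    counts = []
    for name, keep in gates:
        remaining = [event for event in remaining if keep(event)]
        counts.append((name, len(remaining)))
    return remaining, counts


def make_event(**overrides):
    """Build a toy event that passes every gate unless a field is overridden."""
    event = {"fsc_a": 80_000, "fsc_w": 60_000, "ssc_w": 60_000,
             "dapi": 100, "lin": 50, "cd49f": 20_000, "gfp": 5_000}
    event.update(overrides)
    return event


toy_events = [
    make_event(),              # passes all gates
    make_event(fsc_a=10_000),  # debris
    make_event(dapi=9_000),    # dead cell
    make_event(lin=8_000),     # Lin-positive, non-epithelial
    make_event(gfp=100),       # GFP-negative
]

sorted_events, gate_counts = apply_gates(toy_events, GATES)
for gate_name, n_events in gate_counts:
    print(f"{gate_name}: {n_events}")
```

Because each gate only sees the survivors of the previous one, the order of the list matters in the same way as the order of panels in the legend.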
Supplementary Figure 2 Transcriptional profiling of EMPs reveals their hybrid basal and luminal gene expression signature.
a, Graph of the enrichment score of the top functional annotation clusters for genes overrepresented in EMPs compared to LCs. Their ranking is shown in parentheses. b, Graph representing mRNA expression measured by microarray analysis of upregulated genes in FACS-isolated BCs and Lgr5 cells (fold over LC), showing the genes of the axon guidance cluster enriched in EMPs and BCs. c, d, Gene ontology (GO) analysis of genes upregulated > 1.5-fold in both LCs and Lgr5 cells compared to BCs. Histograms represent the enrichment score (c) and the Benjamini-corrected p-value (in log10 base) (d) of the top functional annotation clusters for genes overrepresented in EMPs compared to BCs. Their ranking is shown in parentheses. e, Graph representing mRNA expression measured by microarray analysis of upregulated genes in FACS-isolated LCs and Lgr5 cells (fold over BC), showing the enrichment of cell cycle-related genes in EMPs and LCs. Panels a, c and d are derived from the list of genes upregulated in the comparison of the mean of n = 3 independent microarray samples for Lgr5 cells and the mean of n = 2 independent microarray samples for LCs and BCs. Panels b and e represent the fold change of the mean of n = 3 independent microarrays for Lgr5 cells and the mean of n = 2 independent microarrays for LCs and BCs.
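For readers unfamiliar with the correction named in panels c and d, the following is a minimal sketch that assumes "Benjamini corrected" refers to the standard Benjamini-Hochberg step-up adjustment. The input p-values are invented, and reporting the result as a negative log10 is an assumption about how the histogram was scaled.

```python
import math


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original input order."""
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])  # indices by ascending p
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity of the result.
    for rank in range(n, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * n / rank)
        adjusted[index] = running_min
    return adjusted


def minus_log10(p):
    """Convert a p-value to the -log10 scale (assumed plotting scale)."""
    return -math.log10(p)


toy_p_values = [0.01, 0.04, 0.03, 0.005]  # invented values, for illustration only
adjusted = benjamini_hochberg(toy_p_values)
scaled = [minus_log10(p) for p in adjusted]
```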
a, Unsupervised clustering using SC3 of LCs (n = 73) and BCs (n = 45) with clustering parameter k = 2. Heatmaps of the top 15 marker genes for each cluster and their corresponding normalized expression are displayed (AUC > 0.8 and Wilcoxon signed rank test FDR adjusted p-value < 0.01). Columns represent single cells, colour-coded by their respective lineage. UND (undetermined significance, n = 7) represents a few FACS-isolated CD29HiCD24+ cells with an LC gene signature. b, c, Gene ontology analysis of EMP scRNA-seq. Histograms represent the enrichment score (b) and the Benjamini-corrected p-value (in log10 base) (c) of the top functional annotation clusters for genes overrepresented in EMPs (n = 68). Their ranking is shown in parentheses. d, Sox10 expression from scRNAseq represented on a PCA plot (n = 193 cells). PCA was performed on the top 500 most variable genes in the scRNAseq data; every dot represents a single cell. Colouring represents the normalized expression of Sox10.
Marker genes are obtained from scRNAseq data using SC3 on the adult cell lineages (n = 118) with k = 2 and filtering marker genes with AUC > 0.8 and Wilcoxon signed rank test FDR adjusted p-value < 0.01, and only showing genes which are expressed in more than 50% of EMPs and 50% of either BCs or LCs. The heatmap colouring represents the proportion of cells with > 0 expression for that gene for each cell type (LC, BC, EMP; n = 73, n = 45, n = 68 respectively).
Analysis of scRNAseq data of cells with > 2000 genes detected (n = 261), including 1 row of the sorting plate with aberrant transcriptional profiles. a, Unsupervised clustering using SC3 on EMPs (n = 87), adult BCs (n = 76) and LCs (n = 98) with clustering parameter k = 4. Heatmaps of the top 15 marker genes for each cluster and their corresponding normalized expression are displayed (AUC > 0.8 and Wilcoxon signed rank test FDR adjusted p-value < 0.01). Columns represent single cells, colour-coded by their respective lineage. b, c, Dimensionality reduction using t-Distributed Stochastic Neighbor Embedding (b) and Principal Component Analysis (c). Every dot (n = 261) represents one cell, with the colour representing either the cell type or the assigned SC3 cluster shown in a, respectively. d, Scatter plot with the X-axis representing the proportion of BC-specific marker genes detected by SC3 (n = 53 cells) and the Y-axis the proportion of LC-specific marker genes (n = 47 cells). Marker genes were selected to be expressed in at least 75% of the respective cell type and in less than 25% of the opposite cell type. The proportion of expressed markers is computed as the fraction of markers with > 0 expression over the total number of markers. Every dot (n = 261) represents one cell and is colour-coded according to cell type. Aberrant BCs with a low number of genes detected and aberrant LCs/BCs stemming from 1 row in the plate show a pseudo-hybrid signature and do not cluster with their respective cell types.
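The marker selection rule and the per-cell score described for panel d can be written out as a short sketch. The thresholds (at least 75% of the respective cell type, less than 25% of the opposite type, expression > 0) come from the legend; the function names and the toy expression values are assumptions made for illustration, and Krt14, Krt5 and Krt8 are used only as familiar gene names.

```python
def detection_rate(cells, gene):
    """Fraction of cells in which the gene has expression > 0."""
    return sum(1 for cell in cells if cell.get(gene, 0) > 0) / len(cells)


def select_markers(own_cells, other_cells, genes, min_own=0.75, max_other=0.25):
    """Genes detected in >= min_own of own cells and < max_other of other cells."""
    return [
        gene for gene in genes
        if detection_rate(own_cells, gene) >= min_own
        and detection_rate(other_cells, gene) < max_other
    ]


def marker_fraction(cell, markers):
    """Fraction of the marker list with expression > 0 in a single cell."""
    return sum(1 for gene in markers if cell.get(gene, 0) > 0) / len(markers)


# Toy expression values (invented): each cell maps gene name -> expression.
bc_cells = [{"Krt14": 5, "Krt5": 2}, {"Krt14": 3, "Krt5": 4},
            {"Krt14": 1}, {"Krt14": 2, "Krt5": 1}]
lc_cells = [{"Krt8": 7}, {"Krt8": 3}, {"Krt8": 2}, {"Krt8": 4}]
genes = ["Krt14", "Krt5", "Krt8"]

bc_markers = select_markers(bc_cells, lc_cells, genes)
lc_markers = select_markers(lc_cells, bc_cells, genes)

# A cell expressing markers of both lineages scores above zero on both axes.
hybrid_cell = {"Krt14": 2, "Krt8": 3}
x_value = marker_fraction(hybrid_cell, bc_markers)
y_value = marker_fraction(hybrid_cell, lc_markers)
```

Because the score counts only whether a marker is detected, it depends on how many genes are detected per cell, which is consistent with the legend's remark that cells with few detected genes can show a pseudo-hybrid signature.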
a, Heatmap showing scRNAseq expression within Lgr5-GFP+ E14 subclusters, including Lgr5+ EMPs (n = 68) and stromal mesenchymal cells (n = 11). Marker genes were obtained from the scRNAseq data by comparing LCs (n = 73) to BCs (n = 45) using SC3 with clustering parameter k = 2 and selecting marker genes with an AUC > 0.8 and a Wilcoxon signed rank test FDR adjusted p-value < 0.01. Expression of the epithelial markers Krt14, Krt5 and Krt8 is also shown. The heatmap colouring represents the normalized scRNAseq expression; rows represent the top 27 marker genes for each cluster and columns represent EMP cells. b-e, Scatter plots depicting the correlation between the proportion of BC-specific (n = 53 cells) (b) and LC-specific (n = 47 cells) (d) markers detected and the total number of genes detected, before (c) and after correction (e). Each dot (n = 193) represents a single cell, and colour represents cell type. Correction was performed with a robust linear model for each cell type using the rlm function in R, depicted for each population as a straight line with its 95% confidence interval.
a-d, Immunofluorescence analysis of K8, K14 and p63 (a), K8, K14 and GFP (b, c) and K14, GFP and Foxa1 (d) in WT mice (a) and in K8rtTA/TetOCre/ΔNp63-IRES-GFP mice 2 weeks following the expression of p63-IRES-GFP in LCs (b-d) (6 mice analysed). Arrowheads point to hybrid GFP+ cells coexpressing luminal and basal markers. The arrow points to a GFP+ K14+ K8- cell at the basal membrane. e, Venn diagram showing the substantial and statistically significant overlap between the genes upregulated 2-fold in BCs compared to LCs (adult basal signature) and the genes upregulated by p63 in LCs (p63 LC signature). The gene lists were derived by comparing the means of RNAseq data (n = 2 for p63, WT LC and WT BC). The enrichment p-value was calculated using the hypergeometric test, performed with R software without adjustment, to test whether these 2 data sets of 802 and 2860 genes have a significantly higher overlap (295 genes) than 2 data sets of the same size chosen randomly.
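The overlap test in panel e can be illustrated with a small upper-tail hypergeometric calculation. The legend gives the two list sizes (802 and 2860) and the overlap (295) but not the size of the gene universe, so the value of 20000 used below is purely an assumption and the resulting p-value is not the one reported by the authors. The authors used R; this sketch uses Python's standard library only.

```python
import math


def log_comb(n, k):
    """Natural log of the binomial coefficient C(n, k)."""
    return math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)


def hypergeom_upper_tail(overlap, universe, size_a, size_b):
    """P(X >= overlap) for two random gene lists of size_a and size_b
    drawn from the same universe of genes."""
    lowest = max(overlap, 0, size_a + size_b - universe)
    highest = min(size_a, size_b)
    log_denominator = log_comb(universe, size_b)
    log_terms = [
        log_comb(size_a, x) + log_comb(universe - size_a, size_b - x) - log_denominator
        for x in range(lowest, highest + 1)
    ]
    if not log_terms:
        return 0.0
    # Sum the terms on the log scale to avoid underflow of individual terms.
    largest = max(log_terms)
    total = math.exp(largest) * sum(math.exp(t - largest) for t in log_terms)
    return min(1.0, total)


ASSUMED_UNIVERSE = 20000  # hypothetical number of genes; not stated in the legend
p_value = hypergeom_upper_tail(295, ASSUMED_UNIVERSE, 802, 2860)
print(p_value)
```

Under this assumed universe, the overlap expected by chance is about 802 × 2860 / 20000 ≈ 115 genes, well below the 295 observed, so the computed tail probability is very small. A smaller universe raises the expected overlap and therefore the p-value.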