Background & Summary

The hair follicle, a skin appendage, is the most tractable model for studying appendage development¹. Hair follicle development involves complex interactions between the epidermis and the underlying mesenchyme and arises at a series of specific sites in the ectoderm and the underlying mesoderm². The dynamic morphological changes that occur during hair follicle development have been extensively explored³,⁴,⁵. In mice, hair follicle development in utero has been histologically divided into three stages: induction (E13.5–E14.5), organogenesis (E15.5–E17.5) and cytodifferentiation (E18.5 onwards)⁵, and the molecular and cellular events of these morphological stages have been well characterized. In the induction stage, the first dermal signal initiates hair follicle development, and the placode (Pc) and dermal condensate (DC) structures gradually form. In the organogenesis stage, keratinocyte proliferation leads to the formation of the hair germ, and further down-growth progresses to the peg stage. In the cytodifferentiation stage, the most proximally located keratinocytes begin to enwrap the dermal papilla (DP); this is followed by the bulbous peg stage, in which distinct strata of epithelial differentiation within the hair follicle become morphologically noticeable⁵. Identifying the internal and external signaling mechanisms of hair follicle morphogenesis is key to understanding the dynamic epithelial-mesenchymal interactions during this complex tissue development⁶.

A multicellular organism comprises diverse cell types that are highly specialized to carry out unique functions⁷. The establishment of different cell lineages during development relies on specific spatiotemporal gene expression programs⁸ and gene regulatory networks (GRNs)⁹. Transcription factors bind to enhancers and promoters to regulate target gene expression, ultimately resulting in a cell type-specific transcriptome¹⁰,¹¹,¹². Single-cell technologies provide new opportunities to study the mechanisms underlying cell identity. Single-cell RNA sequencing (scRNA-seq) has recently been adopted to decipher hair follicle heterogeneity across cell sub-populations, to distinguish fine molecular differences between individual cells, and to describe the transcriptional atlas of the mouse skin hair follicle¹,¹³.

Single-cell ATAC sequencing (scATAC-seq), which serves as a read-out of chromatin accessibility⁹, is a powerful tool for interrogating the epigenetic heterogeneity of cells and revealing cell type-specific transcriptional regulatory networks¹⁴. Recent technical advances in scATAC-seq have made it possible to analyze the open chromatin regions of tens of thousands of cells simultaneously and to profile active DNA regulatory elements, such as cis- and trans-regulatory elements¹⁵. These open chromatin regions play important regulatory roles in distinguishing cell types in complex organisms⁷. scATAC-seq plays an essential role in depicting the trajectories of cell differentiation¹⁵,¹⁶, elucidating the transcriptional regulators of developmental lineages¹⁷, and revealing the complex patterns of gene regulatory relationships that maintain cellular states and developmental processes¹⁸.

To systematically investigate the cellular complexity of developing embryonic skin and to gain comprehensive insight into the molecular identity of hair follicle progenitors and niche cells, nuclei were obtained from single-cell suspensions of E13.5, E16.5 and P0 mouse dorsal skin and processed with the 10x Genomics Chromium™ Controller & Accessory Kit. Single-cell libraries were then constructed and subjected to 10x Genomics single-cell ATAC sequencing on a droplet-based commercial platform. The raw data were processed and analyzed with Signac¹⁹. The datasets provide single-cell epigenomic profiles of hair follicle cells from skin at different stages of mouse embryonic development. Our work provides a reference and basis for future single-cell chromatin studies, enriches the spectrum of cellular heterogeneity associated with hair follicle development and its dynamic morphological changes, and serves as a valuable resource for understanding how the system changes during hair follicle morphogenesis.

Materials and Methods

Ethics statement

All experimental protocols were approved by the Experimental Animal Manage Committee of Northwest A&F University (2011-31101684).

Isolation of single cells from mouse skin

Dorsal skin tissues were collected from embryos of pregnant mice at E13.5 and E16.5 and from mice at postnatal day 0 (P0). The dissected skins were first incubated with TrypLE™ Express (TE, 1X) (Gibco) at 37 °C for 30 min, and the epidermis and dermis were then separated under a stereoscope (Motic). Thereafter, the epidermis was digested with TE for 15 min, while the dermis was digested with 2 mg/mL collagenase type II (Sigma, St Louis, MO, USA) for 15 min. The cells were then centrifuged and resuspended in phosphate-buffered saline (PBS) containing 0.04% bovine serum albumin (BSA). Finally, the cell suspensions were filtered through a 40-μm mesh to complete the preparation of the single-cell suspension.

Nuclei isolation and scATAC-seq library preparation

The concentration of the cell suspension was measured immediately using a cell counter (TC20, Bio-Rad, Hercules, CA, USA), and the cell membranes were then lysed with surfactant to prepare the nuclear suspension. The nuclei concentrations were measured and adjusted to the desired capture number. Single-nucleus barcoding and library preparation were performed following the protocol of the 10x Chromium Single Cell ATAC Library & Gel Bead Kit (16 rxns PN-1000110), and the libraries were sequenced on the Illumina NovaSeq 6000 (Illumina, San Diego, CA, US) platform. In total, 8,016, 7,714 and 7,896 single nuclei from the E13.5, E16.5 and P0 stages were sequenced, respectively.

Raw data processing

Raw sequencing data were converted to FASTQ format using Cell Ranger ATAC (version 1.2.0, https://cf.10xgenomics.com/releases/cell-atac/cellranger-atac-1.2.0.tar.gz) following the standard 10x Genomics protocol. The FASTQ files were then aligned to the mouse reference genome mm10 (GRCm38.p6) using cellranger-atac count. Subsequently, Cell Ranger ATAC was used for preliminary data analysis, generating outputs that included barcoded BAM files, peaks.bed, fragments.tsv.gz and per-barcode cell calling. These output files (pre-processed data) were used for downstream analysis and visualization.
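As a rough illustration of how the fragments file feeds the later per-barcode metrics, the sketch below counts fragments per cell barcode. It assumes the usual tab-separated layout of a 10x-style fragments file (chromosome, start, end, barcode, read support) with optional `#` comment lines; the function name `fragments_per_barcode` is hypothetical and is not part of Cell Ranger ATAC or Signac.

```python
import csv
import gzip
from collections import Counter


def fragments_per_barcode(path):
    """Count fragments per cell barcode in a gzipped, tab-separated fragments file.

    Assumed column order: chrom, start, end, barcode, read support.
    Each row is treated as one unique fragment.
    """
    counts = Counter()
    with gzip.open(path, "rt", newline="") as handle:
        for row in csv.reader(handle, delimiter="\t"):
            # Skip blank lines and '#' comment/header lines.
            if not row or row[0].startswith("#"):
                continue
            barcode = row[3]
            counts[barcode] += 1
    return counts
```

A call such as `fragments_per_barcode("fragments.tsv.gz")` returns a `Counter` keyed by barcode, from which a median number of fragments per cell could be derived.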

Bioinformatic analysis of scATAC-seq data

Quality control (QC) filtering

R (version 3.6.1, https://www.r-project.org/) and the Signac R package (version 1.0.0, https://github.com/timoast/signac/)²⁰,²¹ were used for downstream analysis. Barcodes representing genuine cells were identified mainly by the TSS enrichment score and the number of unique fragments. The filtering metrics were chosen with reference to the official Signac tutorial and previous studies (https://satijalab.org/signac/)²⁰. Cells were retained if they met the following criteria: (1) >3,000 and <10,000 unique fragments in peak regions; (2) enrichment at transcription start sites (TSS) ≥2; (3) percentage of reads in peaks ≥15; (4) blacklist ratio ≤0.025; and (5) nucleosome signal <10. Cells that were outliers for these QC metrics were removed.
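The thresholds listed above can be expressed as a single predicate per barcode. The following is a minimal sketch in plain Python rather than the R code used in the study; the `BarcodeQC` container and its field names are illustrative assumptions, and only the numeric cut-offs come from the text.

```python
from dataclasses import dataclass


@dataclass
class BarcodeQC:
    """Per-barcode QC metrics (field names are illustrative)."""
    peak_region_fragments: int
    tss_enrichment: float
    pct_reads_in_peaks: float
    blacklist_ratio: float
    nucleosome_signal: float


def passes_qc(m: BarcodeQC) -> bool:
    """Return True when a barcode satisfies all five criteria stated in the text."""
    return (
        3000 < m.peak_region_fragments < 10000
        and m.tss_enrichment >= 2
        and m.pct_reads_in_peaks >= 15
        and m.blacklist_ratio <= 0.025
        and m.nucleosome_signal < 10
    )
```

Applying `passes_qc` to every barcode and keeping those that return `True` reproduces the filtering logic, under the assumption that the five criteria are combined with a logical AND.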

Normalization and linear dimensional reduction

After QC, the high-quality scATAC-seq datasets were normalized by term frequency-inverse document frequency (TF-IDF) using the Signac function “RunTFIDF”. The dimensionality of the DNA accessibility assay was then reduced by latent semantic indexing (LSI). The first LSI component is usually removed from downstream analysis because it captures sequencing depth rather than biological variation.
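To make the normalization step concrete, here is a small pure-Python sketch of one common TF-IDF formulation for a peak-by-cell count matrix: term frequency is the count divided by the cell total, inverse document frequency is the number of cells divided by the peak total, and the product is scaled and log-transformed. Whether this matches the exact variant and scale factor used in the study is an assumption; the function name `tf_idf` and the default `scale_factor` are illustrative.

```python
import math


def tf_idf(counts, scale_factor=1e4):
    """Log-scaled TF-IDF for a peaks x cells matrix given as a list of rows.

    TF  = count / total counts in the cell (column sum)
    IDF = number of cells / total counts for the peak (row sum)
    value = log(1 + TF * IDF * scale_factor)
    """
    n_peaks = len(counts)
    n_cells = len(counts[0])
    cell_totals = [sum(counts[p][c] for p in range(n_peaks)) for c in range(n_cells)]
    peak_totals = [sum(row) for row in counts]

    normalized = []
    for p, row in enumerate(counts):
        # Peaks with no counts get an IDF of zero so they stay at zero.
        idf = n_cells / peak_totals[p] if peak_totals[p] else 0.0
        new_row = []
        for c, value in enumerate(row):
            tf = value / cell_totals[c] if cell_totals[c] else 0.0
            new_row.append(math.log1p(tf * idf * scale_factor))
        normalized.append(new_row)
    return normalized
```

The normalized matrix would then be passed to a singular value decomposition to obtain the LSI components; that step is omitted here because it requires a numerical library.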

Non-linear dimension reduction and clustering

After linear dimensional reduction, the cells were embedded in a low-dimensional space. Graph-based clustering was performed with the Seurat function “FindClusters”, and non-linear dimension reduction with the UMAP algorithm (Seurat function “RunUMAP”) was used to visualize and identify cell clusters.

Generating a counts matrix and cell-type annotation of scATAC-seq clusters

To define the set of genes with specifically high activity in each cluster, we generated a count matrix and calculated gene scores with the Signac function “GeneActivity()”. The activity of each gene was quantified by evaluating the chromatin accessibility associated with the gene in the scATAC-seq data. A gene activity matrix was generated from the reads mapped to the gene body and promoter (2 kb upstream of the TSS), and the gene score of each gene was calculated. To facilitate cluster annotation, the gene activity of TopFeatures was examined and the gene scores were visualized with “DotPlot”. Finally, the “gene activity” of typical cell type-specific marker genes was visualized for clustering and cell type assignment of the scATAC-seq data.
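The idea behind the gene activity matrix can be sketched as counting, for each cell, the fragments that overlap a gene body extended 2 kb upstream of the TSS. The code below is a simplified illustration under stated assumptions: coordinates are half-open intervals, strand determines which side is "upstream", and the function names `activity_window` and `gene_activity` are hypothetical. It is not the Signac implementation.

```python
from collections import defaultdict


def activity_window(start, end, strand, upstream=2000):
    """Return the gene body extended upstream of the TSS (strand-aware)."""
    if strand == "+":
        return max(0, start - upstream), end
    return start, end + upstream


def gene_activity(fragments, genes, upstream=2000):
    """Count fragments overlapping each gene's activity window, per barcode.

    fragments: iterable of (chrom, start, end, barcode)
    genes: dict mapping gene name -> (chrom, start, end, strand)
    Returns a nested mapping: counts[gene][barcode] -> number of fragments.
    """
    windows = {
        name: (chrom, *activity_window(start, end, strand, upstream))
        for name, (chrom, start, end, strand) in genes.items()
    }
    counts = defaultdict(lambda: defaultdict(int))
    for chrom, frag_start, frag_end, barcode in fragments:
        for name, (g_chrom, win_start, win_end) in windows.items():
            # Half-open interval overlap test on the same chromosome.
            if chrom == g_chrom and frag_start < win_end and frag_end > win_start:
                counts[name][barcode] += 1
    return counts
```

The nested loop is quadratic and only suitable for toy inputs; a real analysis would use an interval index over the gene windows.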

Data Records

We present chromatin accessibility landscapes of different cell types of mouse skin as a reference for exploring in depth the epigenetic regulation of cell heterogeneity. Our dataset consists of chromatin accessibility landscapes for 6,928, 6,961 and 7,374 high-quality single nuclei from E13.5, E16.5 and P0 skin, respectively. According to the developmental characteristics of hair follicles at the different stages and the gene activity of known marker genes in the scATAC-seq data, we assigned biological identities to 6, 8 and 5 populations, respectively. Figure 1 provides an overview of the laboratory and bioinformatic workflow.

Fig. 1 Workflow of mouse skin scATAC-seq.

All skin scATAC-seq data have been uploaded to the NCBI Gene Expression Omnibus (GEO) database with accession number GSE201213²². The raw data of the three samples have been deposited in the NCBI Sequence Read Archive (SRA) and are accessible through the accession numbers SRX14951484²³, SRX14951485²⁴ and SRX14951486²⁵.

Technical Validation

All mouse dorsal skins used in this study were freshly collected, dissected and digested into single cells (Methods). Each sequencing sample was pooled from three independent individuals, mixed in equal proportions by cell count. Pooling biological replicates improved the reliability of the scATAC-seq data.

After sequencing the three libraries on an Illumina NovaSeq 6000 and processing the raw sequencing data with Cell Ranger ATAC v1.2.0, the pre-processed data were analyzed with Seurat and Signac. For the E13.5, E16.5 and P0 pre-processed data, we detected 8,016, 7,714 and 7,896 cells and obtained a median of 19,906, 27,044 and 24,006 fragments per cell, respectively. All the libraries achieved a high fragment overlap rate of 69.3%, 69.9% and 74.9%, respectively (>55%) (Table 1). The Q30 scores exceeded the lower QC threshold, indicating that high-quality mapping data were generated for downstream analysis.

Table 1 Overview of the mapping parameters for the 10x Genomics scATAC-seq datasets established in mice skins.

We further used Signac to filter low-quality data, calculating the TSS enrichment and the number of unique fragments for each cell (Supplementary Fig. 1, available at Figshare²⁶). We computed the nucleosome banding pattern, the total number of fragments in peaks, the fraction of fragments in peaks, the ratio of reads in ‘blacklist’ sites and the transcription start site (TSS) enrichment score in each sample, and retained the cells with >3,000 and <10,000 unique fragments in peak regions, TSS enrichment ≥2, percentage of reads in peaks ≥15, blacklist ratio ≤0.025 and nucleosome signal <10.

After QC, 6,928, 6,961 and 7,374 high-quality nuclei were further analyzed, and the cell clustering was visualized by UMAP. The E13.5 sample formed 8 indistinct clusters, the E16.5 sample formed 12 separated clusters and the P0 sample formed 13 clusters. Differential gene activity between the clusters was identified. The top 20 differentially active genes per cluster can be found in Supplementary Table 1 (available at Figshare²⁶) for the E13.5 clusters, Supplementary Table 2 (available at Figshare²⁶) for the E16.5 clusters and Supplementary Table 3 (available at Figshare²⁶) for the P0 clusters.

Differential gene activity in the pre- and early post-implantation mammalian embryo results in the expression of certain parts of the genotypic potential to create a phenotypic form²⁷. From the literature, we expected to find several dermal and epidermal cell types during hair follicle development, such as epidermal keratinocytes, dermal fibroblasts, neural crest-derived melanocytes and Schwann cells⁶.

We focused on the “gene activity” of each cluster and on the marker genes of different cell types to validate that the established dataset indeed represented hair follicle populations. In E13.5, clusters 0, 1 and 2 mainly expressed the fibroblast markers Twist2²⁸ and Col1a1²⁹; clusters 3 and 8 expressed the keratinocyte markers Krt14³⁰ and Krt15³¹; cluster 4 expressed the macrophage markers Cd86³² and Inpp5d³³; cluster 5 expressed the Schwann cell markers Sox5⁶ and Sox10³⁴; cluster 6 expressed the blood vessel markers Pecam1³⁵ and Kdr³⁶,³⁷; and cluster 7 expressed the muscle markers Pax7³⁰ and Cdh15³⁸. In E16.5, clusters 0, 2 and 6 expressed the fibroblast markers Twist2 and Col1a1; cluster 1 expressed the keratinocyte markers Krt14 and Krt15; clusters 3 and 9 expressed the blood vessel markers Kdr and Flt4³⁶; cluster 4 expressed the lymphocyte markers Cpa3³⁹ and Ccr8⁴⁰; cluster 5 expressed the macrophage markers Cd86 and F13a1⁴¹; cluster 7 expressed the muscle markers Myod1⁴² and Myog⁴³; cluster 8 expressed the Schwann cell markers Gpr17 and Lims2⁶; cluster 10 expressed the melanocyte markers Tyr⁶ and Dct⁴⁴; and cluster 11 expressed the melanocyte markers Ctsd⁶ and Lamp1⁴⁵. In P0, clusters 0, 2, 3, 7 and 9 mainly expressed the fibroblast markers Twist2 and Col1a1; clusters 1, 4, 5, 6 and 8 expressed the keratinocyte markers Krt14 and Krt15; cluster 10 expressed the melanocyte markers Pax3⁶ and Plp1⁴⁶; cluster 11 expressed the blood vessel markers Kdr and Cdh5³⁰; and cluster 12 expressed the pericyte markers Ebf2³⁰ and Rgs5⁴⁷.

According to the gene activity of the marker genes, the cells were classified into 6, 8 and 5 populations in E13.5, E16.5 and P0, respectively. The 6 populations were fibroblasts, keratinocytes, macrophages, Schwann cells, blood vessels and muscle (Fig. 2a); the 8 populations were fibroblasts, keratinocytes, blood vessels, lymphocytes, macrophages, muscle, Schwann cells and melanocytes (Fig. 2b); and the 5 populations were fibroblasts, keratinocytes, melanocytes, blood vessels and pericytes (Fig. 2c). The specific markers of these cell types are shown in Fig. 3. GO term analysis was performed on the top 20 differentially active genes (Fig. 4). Fibroblasts were enriched in biological processes including skeletal system development and embryonic morphogenesis, and keratinocytes in processes including skin development, keratinization, skin epidermis development and hair follicle development; the other cell clusters were also enriched in the corresponding development- and differentiation-related processes (Fig. 4). These results further support the validity of the clustering.
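The cluster-to-cell-type assignments described above can be written down as a lookup table, which makes the population counts easy to check. The sketch below encodes the assignments exactly as listed in the text; the dictionary and function names are illustrative, and the grouping helper is a convenience rather than part of the original analysis.

```python
# Cluster -> cell type, transcribed from the marker-based annotation in the text.
E13_5 = {
    0: "fibroblasts", 1: "fibroblasts", 2: "fibroblasts",
    3: "keratinocytes", 8: "keratinocytes",
    4: "macrophages", 5: "Schwann cells", 6: "blood vessels", 7: "muscle",
}
E16_5 = {
    0: "fibroblasts", 2: "fibroblasts", 6: "fibroblasts",
    1: "keratinocytes", 3: "blood vessels", 9: "blood vessels",
    4: "lymphocytes", 5: "macrophages", 7: "muscle",
    8: "Schwann cells", 10: "melanocytes", 11: "melanocytes",
}
P0 = {
    0: "fibroblasts", 2: "fibroblasts", 3: "fibroblasts",
    7: "fibroblasts", 9: "fibroblasts",
    1: "keratinocytes", 4: "keratinocytes", 5: "keratinocytes",
    6: "keratinocytes", 8: "keratinocytes",
    10: "melanocytes", 11: "blood vessels", 12: "pericytes",
}


def populations(cluster_to_type):
    """Group cluster IDs by assigned cell type."""
    groups = {}
    for cluster in sorted(cluster_to_type):
        groups.setdefault(cluster_to_type[cluster], []).append(cluster)
    return groups
```

Calling `len(populations(E13_5))`, `len(populations(E16_5))` and `len(populations(P0))` gives the number of annotated populations per stage.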

Fig. 2 Clustering and UMAP visualization of scATAC-seq data in E13.5 (a), E16.5 (b) and P0 (c) mouse skin.

Fig. 3 A paired dot plot of scaled expression of selected marker genes for cell type identification in E13.5 (a), E16.5 (b) and P0 (c). The dot size encodes the proportion of cells that express the gene, while the color encodes the scaled average expression level across those cells (dark blue is high).

Fig. 4 GO enrichment analysis of the top 20 differentially active genes in E13.5 (a), E16.5 (b) and P0 (c).

In addition, the percentages of fibroblasts, keratinocytes and other cell types were calculated (Fig. 5). The percentage of fibroblasts gradually decreased (E13.5: 63.5%; E16.5: 57.8%; P0: 47.8%), whereas the percentage of keratinocytes gradually increased (E13.5: 16.8%; E16.5: 17.5%; P0: 43.6%). This result is consistent with the course of hair follicle development. Dermal fibroblasts migrate directly to form the DC structure⁴⁸,⁴⁹, and dermal condensate cells are the precursors of the dermal papilla/dermal sheath niche cells within the mature follicle⁵⁰. Progenitor cells migrate and then form the physically identifiable Pc⁵¹; placode cells are the earliest progenitors of all epithelial hair follicle cells, including the adult stem cells (SCs) in the bulge⁵². Pc progenitors signal back to the dermis for the formation of the DC⁶,⁵³. The formation of the Pc and DC marks the beginning of hair follicle development⁵⁴. The DC structure then forms the dermal papilla (DP) through DC1 and DC2⁵⁵,⁵⁶, and the DP is enwrapped by the most proximally located keratinocytes. Finally, keratinocyte proliferation leads to the formation of the hair germ, further down-growth progresses to the peg stage, and the HF becomes morphologically noticeable⁵.

Fig. 5 Percentage of each cell type in E13.5, E16.5 and P0.

Finally, we integrated the fibroblast and keratinocyte clusters from the different developmental stages (Fig. 6) and performed further analysis. Chromatin accessibility analysis identified the differentially accessible regions (DARs) of fibroblasts (Supplementary Table 4, available at Figshare²⁶) and keratinocytes (Supplementary Table 5, available at Figshare²⁶) among E13.5, E16.5 and P0. After alignment to the reference genome, the DARs were annotated as promoter, intron, exon, 3′ UTR, etc. In the annotation information for fibroblasts (Supplementary Table 6, available at Figshare²⁶) and keratinocytes (Supplementary Table 7, available at Figshare²⁶), we focused on the regions annotated to the fibroblast markers (Col1a1 and Twist2) and the keratinocyte markers (Krt14 and Krt15). We identified different DARs in fibroblasts and keratinocytes. In fibroblasts, the chr11-94912196-94914313 region was annotated to the distal intergenic region of Col1a1 (Fig. 7a), and chr1-91737593-91738868 and chr1-91848615-91866615 were annotated to the distal intergenic region of Twist2 (Fig. 7b). In keratinocytes, chr11-100218634-100219710, chr11-100221566-100222419 and chr11-100199847-100210887 were annotated to the distal intergenic region of Krt14 (Fig. 7c), and chr11-100127801-100129561 was annotated to the UTR region of Krt15 (Fig. 7d). It is generally believed that the accessibility of promoter regions is related to gene expression, so the differential peaks in the promoters of Krt14 and Krt15 may play an important role in regulating gene expression.

We also performed motif enrichment analysis on the DARs of fibroblasts and keratinocytes at the different stages; the enrichment results are shown in Supplementary Tables 8 and 9 (available at Figshare²⁶). The stage-differential peaks in fibroblasts were significantly enriched for Twist2, Junb, and Nfatc1 and Lef1, respectively (Fig. 8a). These transcription factors are related to the development of dermal fibroblasts⁶. The stage-differential peaks in keratinocytes were significantly enriched for Lhx2, Lef1 and Sox9, respectively (Fig. 8b). Lhx2 is a transcription factor positioned downstream of the signals necessary to specify hair follicle stem cells⁵⁷. Sox9 is an important transcription factor in dermal fibroblasts⁶. This result provides a basis for explaining the reciprocal epithelial-mesenchymal signaling that is essential for the morphogenesis of mouse dorsal skin⁵⁸.
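The text does not state which statistical test was used for motif enrichment. One widely used choice, and the one described in the Signac documentation for its motif-finding function, is a hypergeometric test that compares how often a motif occurs in the DARs with how often it occurs in a background peak set. The sketch below shows that calculation under this assumption; the function name and parameter names are illustrative.

```python
from math import comb


def hypergeom_enrichment_p(observed, n_query, n_background_with_motif, n_background):
    """Upper-tail hypergeometric p-value, P(X >= observed).

    observed: query peaks (e.g. DARs) that contain the motif
    n_query: total number of query peaks
    n_background_with_motif: background peaks that contain the motif
    n_background: total number of background peaks (query peaks included)
    """
    upper = min(n_query, n_background_with_motif)
    total = comb(n_background, n_query)
    tail = sum(
        comb(n_background_with_motif, i)
        * comb(n_background - n_background_with_motif, n_query - i)
        for i in range(observed, upper + 1)
    )
    return tail / total
```

Exact integer arithmetic keeps the sketch simple but becomes slow for very large peak sets, where a statistics library would be the practical choice.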

Fig. 6 UMAP visualization of fibroblasts and keratinocytes integrated from E13.5, E16.5 and P0. (a) Fibroblasts, (b) keratinocytes.

Fig. 7 Chromatin accessibility of fibroblast and keratinocyte markers at E13.5, E16.5 and P0. Chromatin accessibility of the fibroblast markers Col1a1 (a) and Twist2 (b) in fibroblasts. Chromatin accessibility of the keratinocyte markers Krt14 (c) and Krt15 (d) in keratinocytes.

Fig. 8 Motif enrichment in differential peaks between E13.5, E16.5 and P0. (a) Fibroblasts, (b) keratinocytes.

To better understand the heterogeneity of the fibroblast and keratinocyte clusters integrated from the different developmental stages, we subclustered the fibroblasts and the keratinocytes into 12 clusters each. In fibroblasts, clusters 0 and 5 expressed the fibroblast progenitor markers Zfhx4, Zfhx3 and Wnt11⁶,⁵⁹; cluster 1 expressed the papillary fibroblast markers Scel and Sgcg³⁰; cluster 2 expressed the papillary fibroblast markers Adamts5 and Zbtb20³⁰; cluster 3 expressed the reticular fibroblast markers Pdpn and Xdh³⁰; cluster 4 expressed the dividing fibroblast markers Kif4 and Kif14³⁰; cluster 6 expressed the reticular fibroblast markers Thbs1 and Thbs2³⁰; cluster 7 expressed the pre-DC markers Pdgfra and Hes1⁵⁶; cluster 8 expressed the DP markers Sox18 and Sox2⁶; cluster 9 expressed the DC markers Vdr and Gas6⁵⁶; cluster 10 expressed the fascia marker Mfap5³⁰; and cluster 11 expressed the DC markers Sox9 and Fgf10⁵⁶ (Fig. 9a). In keratinocytes, cluster 0 expressed the hair germ markers Shh and Lef1³⁶; cluster 1 expressed the pre-Pc markers Wnt10b⁶⁰ and Wnt9b³⁶; cluster 2 expressed the outer root sheath (ORS) markers Krt5 and Krt16³⁶; cluster 3 expressed the Pc marker Dkk4³⁶; clusters 4 and 8 expressed the markers of hair follicle stem cells (HFSCs) and their precursors, Tgfb2, Adamts17, Fbn2 and Adamts20³⁶; cluster 5 expressed the Pc marker Wif1³⁶; cluster 6 expressed the HFSC markers Sox9 and Nfatc1⁶³; clusters 7 and 10 expressed the epithelial cell markers Krt14, Smoc2, Pvrl4 and Ovol1⁶; cluster 9 expressed the keratinized cell markers Krt1 and Krt10³⁶; and cluster 11 expressed the terminally differentiated cell markers Krt80 and Ly6d³⁶ (Fig. 9b). According to the gene activity of the marker genes, the fibroblasts were classified into 8 populations: fibroblast progenitor, papillary fibroblast, reticular fibroblast, DC, pre-DC, DP, fascia and dividing fibroblast (Fig. 9c). The keratinocytes were classified into 9 populations, including Pc, hair germ, pre-Pc, ORS, HFSCs and their precursors, keratinized cells, epithelial cells and terminally differentiated cells (Fig. 9d). The differential gene activity of the fibroblast and keratinocyte subtypes is shown in Supplementary Table 10 (available at Figshare²⁶) and Supplementary Table 11 (available at Figshare²⁶), respectively. The differential peaks of the fibroblast and keratinocyte subtypes are shown in Supplementary Table 12 (available at Figshare²⁶) and Supplementary Table 13 (available at Figshare²⁶), respectively. Many studies have found that promoter accessibility is positively correlated with gene expression, and the strongest correlation may be related to the function of housekeeping genes⁶¹. However, recent studies have found that there is a weak or no correlation between promoter accessibility and the transcription level of some genes⁶². The mechanism of this regulatory process remains to be elucidated.

Fig. 9 Subclustering of fibroblasts and keratinocytes by gene activity. (a) Dot plots showing gene activity of marker genes for fibroblast subtypes. (b) Dot plots showing the expression of marker genes for keratinocyte subtypes. (c) UMAP plots showing single-cell chromatin accessibility analyzed in fibroblasts. (d) UMAP plots showing single-cell chromatin accessibility analyzed in keratinocytes.

Taken together, our datasets provide a valuable resource for exploring in depth the epigenetic regulation of cell heterogeneity.