Single-cell transcriptomics defines keratinocyte differentiation in avian scutate scales

The growth of skin appendages, such as hair, feathers and scales, depends on terminal differentiation of epidermal keratinocytes. Here, we investigated keratinocyte differentiation in avian scutate scales. Cells were isolated from the skin on the legs of 1-day old chicks and subjected to single-cell transcriptomics. We identified two distinct populations of differentiated keratinocytes. The first population was characterized by mRNAs encoding cysteine-rich keratins and corneous beta-proteins (CBPs), also known as beta-keratins, of the scale type, indicating that these cells form hard scales. The second population of differentiated keratinocytes contained mRNAs encoding cysteine-poor keratins and keratinocyte-type CBPs, suggesting that these cells form the soft interscale epidermis. We raised an antibody against keratin 9-like cysteine-rich 2 (KRT9LC2), which is encoded by an mRNA enriched in the first keratinocyte population. Immunostaining confirmed expression of KRT9LC2 in the suprabasal epidermal layers of scutate scales but not in interscale epidermis. Keratinocyte differentiation in chicken leg skin resembled that in human skin with regard to the transcriptional upregulation of epidermal differentiation complex genes and genes involved in lipid metabolism and transport. In conclusion, this study defines gene expression programs that build scutate scales and interscale epidermis of birds and reveals evolutionarily conserved keratinocyte differentiation genes.


Results
Keratin KRT9LC2 is a marker of differentiated keratinocytes in scutate scales of chickens. The chicken has a diversified set of keratins 29 which are hypothesized to mark specific states of epithelial cell differentiation. Keratin 9-like cysteine-rich 2 (KRT9LC2), also referred to as Hard Acid Sauropsid-specific 2 (HAS2) 29 , was detected by RT-PCR in scutate scales and analysis of transcriptome data 28 suggested that it is transcriptionally upregulated during in vitro differentiation of keratinocytes isolated from chicken leg skin (Fig. 1A). An antibody raised against a carboxy-terminal peptide of the KRT9LC2 protein ( Supplementary Fig. S1) detected a prominent band at the predicted size of 51 kD in protein extracts from the stratified epidermis of an in vitro skin model but not in extracts from monolayer cultures of undifferentiated chicken keratinocytes (Fig. 1B). The KRT9LC2 protein was also detected in extracts from scutate scales but not in back skin or reticulate scales of chickens (Fig. 1C).
Immunohistochemical staining yielded a strong KRT9LC2 signal in the suprabasal keratinocytes of scutate scales whereas interscale epidermis was not immunostained (Fig. 1D,E). When the primary antibody was replaced by preimmune serum, there was no immunostaining, confirming the absence of unspecific staining (Fig. 1F). The reticulate scales were immunonegative (Fig. 1G). These results demonstrated that KRT9LC2 is a marker of differentiated keratinocytes that form the hard outer surface of scutate scales.
scRNA-seq analysis reveals two distinct populations of differentiated keratinocytes in chicken scutate scales. To characterize gene expression during keratinocyte differentiation in chicken scutate scales, we isolated cells from the legs of 1-day old chicks (n = 3) and subjected them to single-cell RNA-sequencing (scRNA-seq). The protocol was designed to enrich for epidermal keratinocytes but smaller populations of fibroblasts, smooth muscle cells, endothelial cells, Schwann cells and erythrocytes were also detected ( Fig. 2A, Supplementary Fig. S2).
According to nearest neighbor clustering as implemented in Seurat 30 , keratinocytes were divided into 5 clusters (Supplementary Table S1). Clusters KC1, KC2 and KC3 ( Fig. 2A) represented non-differentiated cells, characterized by high expression levels of KRT14L1 (Fig. 2B) 29 , which is the avian homolog of KRT14, a marker of the basal epidermal layer in mammals. KRT9L3 (Fig. 2C), a cysteine-poor keratin upregulated during differentiation of chicken keratinocytes in in vitro skin models 28 , was expressed at high levels in cluster KC4 ( Fig. 2A), for which CBP63 was defined as another marker gene (Supplementary Table S1). Expression of CBP63 (GenBank Gene ID: 101751614), previously referred to as "β-keratin, Chr25, Ktn13" 27 , was demonstrated by mRNA in situ hybridization in the interscale epidermis of scutate scales 27 . Therefore, cluster KC4 contained keratinocytes of the soft interscale epidermis. Cluster KC5 ( Fig. 2A) was defined by marker genes such as KRT9LC2 (Supplementary  Table S1; Fig. 2D). From the immunolocalization of KRT9LC2 (Fig. 1D) we inferred that KRT9LC2-positive cells represented the hard scales. The clustering of cells was reproduced in the 3 biological replicates investigated in this study ( Supplementary Fig. S3).
Keratinocyte differentiation is associated with the expression of distinct genes in scale and interscale segments of chicken scutate scales. To determine genes that are upregulated during keratinocyte differentiation in scutate scales, we compared gene expression in cells containing KRT14L1 transcripts, marking the non-differentiated state of keratinocytes, versus gene expression in cells positive for one or both of the two differentiation markers defined above, i.e. KRT9L3 and KRT9LC2. In KRT9L3-positive cells 219 genes were expressed at higher levels than in KRT14L1-positive cells (Fig. 3A, Supplementary Table S2), and in KRT9LC2-positive cells 213 genes were upregulated (> 0.25 Log2-fold average upregulation, P < 0.001) (Fig. 3B, Supplementary Table S3). The majority of these genes (n = 133), including the type II keratin, KRT78L2, the epidermal differentiation complex gene EDQL ( Supplementary Fig. S4), homologs of mammalian keratinocyte differentiation-associated genes, such as DSP, FABP5, POF1B, and others (Supplementary Tables S2 and S3) were upregulated both in KRT9L3-positive and in KRT9LC2-positive cells relative to KRT14L1-positive cells.
To identify specific markers of hard and soft epidermal differentiation in scutate scales, we compared gene expression levels in KRT9LC2-positive versus KRT9L3-positive cells and determined the genes that differed most strongly with regard to their expression in these cells (Tables 1 and 2, Supplementary Fig. S4). KRT9LC2-positive cells accumulated, amongst others, the cysteine-rich keratin KRT9LC1 ( Supplementary Fig. S4B), scale-associated CBPs such as CBP53 ( Supplementary Fig. S4F), the lectin LGASL1, and HOPX, whose mammalian ortholog regulates keratinocyte differentiation 31 (Table 1). Likewise, CTNNB1, previously reported as a regulator of scutate scale development, was found enriched in differentiated keratinocytes of hard scales (Table 1). KRT9L3-positive cells accumulated cysteine-poor keratin KRT9L4 ( Supplementary Fig. S4A), keratinocyte-associated CBPs such as CBP62 and CBP63 ( Supplementary Fig. S4E), the lectin LGAS1, and PRDX1, an antioxidant enzyme 32 (Table 2). Thus, the results of this study suggest catalogues of genes associated with keratinocyte differentiation in hard epidermal segments (Table 1) and genes associated with keratinocyte differentiation in soft interscale segments (

Discussion
Differentiation of keratinocytes underlies the growth of epithelial skin structures, such as claws, hair, feathers and scales of mammals, reptiles and birds. The molecular control of keratinocyte differentiation is well characterized for mammalian interfollicular epidermis and skin appendages [33][34][35] , whereas little is known about the genetic regulation of keratinocyte differentiation in sauropsids. The results of the present study shed light into keratinocyte differentiation in scutate scales and provide a basis for the comparative analysis of further epithelial cell differentiation processes in avian claws, beak and feathers. www.nature.com/scientificreports/ We have used scRNA-seq to characterize two types of keratinocyte differentiation, leading to the hard outer surface and the soft interscale epidermis of scutate scales. The methodology was adapted from successful scRNAseq analyses of human and mouse skin [36][37][38][39] . scRNA-seq of mouse tail skin revealed two paths of keratinocyte differentiation into scale and interscale epidermis 40 . Of note, hard scales of the mouse tail were found to contain transcripts of cysteine-rich keratins such as KRT31 whereas the soft interscale regions contained epidermal keratins such as KRT10 and epidermal differentiation complex (EDC) genes such as involucrin 40,41 . In contrast to the availability of many antibodies against mouse keratinocyte proteins, we had only one antibody, anti-KRT9LC2, specific for a keratin expressed in chicken scutate scales. This was a significant limitation of the present study. We were able to ascertain the expression of KRT9LC2 in differentiated keratinocytes of hard scales, and mRNA in situ hybridization data published by Wu et al. 2015 supported the expression of CBP63 in interscale epidermis 27 . However, other putative differentiation markers, that are suggested by our results, remain to be localized in situ in future studies.
Gene expression in interscale epidermis of chicken leg skin showed several similarities to gene expression in two models of chicken epidermis with a soft cornified layer 28 . scRNA-seq analysis of chicken back skin and bulk RNA-seq analysis of an organotypic model of chicken skin revealed expression of EDC genes and cysteine-poor but not cysteine-rich keratins 28 . Many of the genes upregulated during differentiation of back skin keratinocytes 28 , such as KRT9L4, LOR1, KRT9L3, BDH1L, EDQM2, SPTSSB, EDQM1, AADACL4B, LIPML2, and ELOVL4 (Table 2) were enriched in interscale versus scale epidermis. Conversely, genes enriched in hard scale epidermis, such as KRT9LC2, KRT9LC1, CBP53-S, CBP54-S, CBP55-S, EDMTF1, and MT4 (Table 1) were not upregulated during differentiation of back skin keratinocytes 28 . Therefore, we conclude that the genetic program of keratinocyte differentiation in the soft interscale epidermis of scutate scales is similar to the keratinocyte differentiation program in the soft back epidermis. www.nature.com/scientificreports/ Keratinocyte differentiation in the hard outer surface of scutate scales differs substantially from that in the interscale regions. The results of the present study establish KRT9LC2, also referred to as HAS2 keratin 29 , as a protein marker of the hard scutate scales and identify other genes that are co-regulated with KRT9LC2. Among these scale-associated genes was KRT9LC1 (GenBank Gene ID:772,080) ( Supplementary Fig. S3B), also referred to as Hard Acidic Sauropsid-specific 1 (HAS1) keratin 29 . In situ hybridization of transcripts corresponding to this gene (then named KRT13A) demonstrated predominant expression in the outer surface epithelium of scutate scales 27 , thus validating our scRNA-seq data. Another gene co-expressed with KRT9LC2 was CBP55-S, a scale CBP (beta-keratin). In the aforementioned study of Wu and colleagues 27 , expression of this gene (then named β-keratin, Chr25, Scale18) was detected by in situ hybridization specifically in the outer surface epithelium of scutate scales, further validating our scRNA-seq data.
An important result of this study is the genome-wide gene expression catalog of scutate scale epidermis resolved at the single-cell level. In addition to the genes discussed above, many more genes with differentiationdependent expression were identified both in soft interscale epidermis (Supplementary Table S2) and hard scales (Supplementary Table S3). With regard to ongoing efforts to characterize evolutionarily ancient and derived patterns of gene expression during epidermal keratinocyte differentiation 28,29,[42][43][44][45][46][47][48] , the results of the present study support the hypothesis that EDC genes, anti-inflammatory interleukin 1 family cytokines (IL-36RN and IL-1RN) (Supplementary Tables S2 and S3; Supplementary Fig. S5) and lipid metabolism and lipid transport-related genes, such as FABP5 and GLTP 28 belong to the common keratinocyte differentiation program of amniotes. The transcriptome data generated in this study will be particularly useful for characterizing the process of hard cornification in a non-mammalian model species. The single-cell transcriptomes of chicken leg skin are now accessible for data searches according to criteria not limited to keratinocyte differentiation, so that new research questions pertaining to avian skin biology can be addressed in future studies.

Methods
Tissue preparation and scRNA-seq analysis. One day old chicks (strain Lohmann) were obtained from Schropper GmbH, Gloggnitz, Austria. Skin was excised from the leg of sacrificed animals and incubated in thermolysin (0.5 mg/ml) (Sigma-Aldrich) at 37 °C for 45 min. The lower dermis was removed using forceps and the remaining tissue, including the epidermis and parts of the upper dermis, was processed further according to protocols established for human skin 39,49 . For the isolation of cells, the tissue was split into two fractions that were incubated either in buffer-enzyme mix of the Whole Skin Dissociation Kit human (MACS Milteny Biotech) for 2.5 h at 37 °C or with 0.05% trypsin/EDTA (Thermo Fisher Scientific) and DNase 1 (10 µg/ml) (Roche Diagnostics) at 37 °C for 15 min. Afterwards the samples were combined and processed according to the manufacturer's protocol (Whole Skin Dissociation Kit human, MACS Milteny Biotech). In brief, the epidermis-  Analysis of scRNA-sequencing data. We distinguished between background noise and droplets containing cells using emptyDrops 50 . Briefly, this method models ambient RNA background in the data set and tests for deviations from the background RNA. We used a false discovery rate of 0.05 to call cells to be included into further analysis. On the other end of the spectrum we used scran package to remove droplets containing more than one cell 51 . The applied approach simulates thousands of doublets by adding together two randomly chosen single cell profiles. For the doublet score calculation, cell clustering including the set randomly generated doublets is performed. Then for each cell of the original dataset, the number of simulated doublets in their neighbourhood is recoded and used as input for score calculation. We used 200 nearest neighbours for each cell and applied a threshold of doublet score > 4 to identify doublets in each dataset separately. Doublet score was log10 of the ratio between simulated doublet cells and total number of neighbours taken into consideration for each cell. The data obtained from 3 biological replicates were submitted to the NCBI Gene Expression Omnibus (GEO) database under accession numbers GSE179690 (BioProject PRJNA744554). The individual samples were referred to as "leg skin 1" (BioSample: SAMN20109848, SRA: SRX11375855), "leg skin 2" (BioSample: SAMN20109847, SRA: SRX11375856) and "leg skin 3" (BioSample: SAMN20109846, SRA: SRX11375857). Subsequent to individual quality control of samples, raw read counts of across datasets were combined to one count matrix. Reads from 11,779 cells were included in the final analysis. We used Pearson residuals derived from a generalized negative binomial model of UMI counts as input for principal component analysis and differential gene expression analysis. This approach is implemented in the R package Seurat as sctransform 52 . Apart from cellular sequencing depth, which is added by default to the regression model, we also adjusted for mitochondrial RNA content, batch, and cell cycle score. Calculation of cell cycle scores was performed as implemented in Seurat package where gene expression of cell cycle marker genes are combined to score. The score consisted of 43 genes primarily expressed in G1/S and 55 primarily expressed in G2/M 53 . We removed cells with a mitochondrial RNA content above 15%. Batch correction between individual datasets was done as part of the principal component analysis using the Harmony algorithm 54 . Harmony starts off with user supplied information about batch, then uses fuzzy clustering to assign each cell to multiple clusters in a way that batch diversity in each cluster is maximized. To get a correction factor for each cell, global and batch-specific centroids are calculated for each cluster. Data are corrected for batch and the procedure is repeated until convergence of global and batch-specific centroids 54 . We used PC 1-15 for subsequent UMAP analysis and nearest neighbor based clustering at a resolution of 0.2.
For further analysis of differentially expressed genes (DEGs), three different clusters, characterized by a certain expression threshold of marker genes, e.g. KRT9L3 > 2, KRT9LC2 > 1.5 and KRT14L1 > 1.5, were created. Gene expression levels in different clusters were compared and adjusted P values were calculated according to the standard algorithm implemented in Seurat 30 .
Quantitative reverse-transcription polymerase chain reaction. RNA was isolated from chicken tissues and skin models 28 46 . The expression levels of these genes were compared between scutate scales and other tissues, considering differences with a P value of < 0.05 significant (two-sided t-test).
Western blot analysis. Proteins were prepared from chicken skin and scales by treatment with the Precellys system (VWR, International, Radnor, PA) and from chicken keratinocytes cultured in vitro 28 by sonication in Laemmli buffer containing 2% SDS. Thirty µg of protein per lane were electrophoresed through an ExcelGel SDS 8-18% polyacrylamide gel (GE Healthcare Life Sciences) and afterwards blotted onto a nitrocellulose membrane (GVS Life Sciences). Subsequently, the membrane was blocked with phosphate-buffered saline containing 5% milk powder (Sigma-Aldrich), 2% bovine serum albumin (Sigma-Aldrich) and 0.1% Tween (Sigma-Aldrich) at room temperature for one hour, and incubated with mouse anti-KRT9LC2 antibody (1:500) that was raised in mice by immunization with a synthetic peptide CAAAEIQVPCRRICD, corresponding to the carboxy-terminus of the protein (GenBank accession number XP_418162.6, GenBank definition: keratin, type I cytoskeletal 19) ( Supplementary Fig. S1) conjugated to keyhole limpet hemocyanin, according to a published protocol 55 . After overnight incubation at 4 °C, the membrane was washed and sheep anti-mouse immunoglobulin G (1:10,000, GE Healthcare UK Limited) coupled with horseradish peroxidase used as secondary antibody at room temperature for one hour. The chemiluminescence system (Clarity Western ECL Substrate, BioRad) served for the protein detection. For loading control, the membrane was reincubated with anti-mouse GAPDH (1:5000, HyTest) and sheep anti-mouse immunoglobulin G (1:10,000, GE Healthcare UK Limited), coupled with horseradish peroxidase. The recordings of the chemiluminescence signal over the entire blots are shown in Supplementary  Fig. S6 and the relevant portions thereof are depicted in Fig. 1C,D.
Immunohistochemistry. Immunohistochemistry was performed according to published protocols with modifications 46 . In brief, chicken tissue samples were fixed in 7.5% formaldehyde and embedded in paraffin. Citrate buffer pH6 (DAKO) was used to retrieve the antigens and mouse anti-KRT9LC2 antibody (1:500) as primary antibody. To block unspecific binding, 10% sheep serum was added to secondary sheep anti-mouse immunoglobulin G (GE Healthcare), and further the nuclei were counterstained with haematoxylin. For control experiments, the primary antibody was replaced by the pre-immune serum.
Ethics statement. All animal procedures were approved by the Animal Care and Use Committee of the Medical University of Vienna. All procedures were performed in accordance with the guidelines established by this committee and in adherence to the ARRIVE guidelines 56 .

Data availability
Single-cell transcriptomes generated in this study are available at GEO under accession number GSE179690. All other data generated or analysed during this study are included in this published article and its Supplementary Information files.  Table 2. Gene expression in KRT9L3 + versus KRT9LC2 + cells: Genes upregulated in KRT9L3 + keratinocytes (interscale epidermis).