Combined proteomics and transcriptomics reveal the genetic basis underlying the differentiation of skin appendages and immunity in pangolin

Pangolin (Mains javanica) is an interesting endangered mammal with special morphological characteristics. Here, we applied proteomics and transcriptomics to explore the differentiation of pangolin skin appendages at two developmental stages and to compare gene expression profiles between abdomen hair and dorsal scale tissues. We identified 4,311 genes and 91 proteins differentially expressed between scale-type and hair-type tissue, of which 6 genes were shared by the transcriptome and proteome. Differentiation altered the abundance of hundreds of proteins and mRNA in the two types of skin appendages, many of which are involved in keratinocyte differentiation, epidermal cell differentiation, and multicellular organism development based on GO enrichment analysis, and FoxO, MAPK, and p53 signalling pathways based on KEGG enrichment analysis. DEGs in scale-type tissues were also significantly enriched in immune-related terms and pathways compared with that in hair-type tissues. Thus, we propose that pangolins have a normal skin innate immune system. Compared with the abdomen, the back skin of pangolins had more genes involved in the regulation of immune function, which may be an adaptive adjustment for the vulnerability of scaly skin to infection and injury. This investigation provides a scientific basis for the study of development and immunity of pangolin skin, which may be helpful in the protection of wild pangolin in China.

ages, we harvested skin samples from M. javanica in biological replicates with 7 specimens including 2 embryos and 5 adult specimens at two developmental stages for two skin appendage types: embryo/adult-hair and embryo/adult-scale group (Fig. 1a). The 14 samples were divided into identical pools and subjected to label-free MS-based proteomics and RNA-seq-based transcriptomics. The adult-hair3 sample in the sequencing results was of poor data quality; thus, this sample was deleted from our transcriptome data.
The Illumina sequencing reads were deposited in the Short Read Archive as accession number PRJNA610466 for M. javanica. For the two types of pangolin skin, a total of 84,182,093 clean reads were obtained, corresponding to 90.54% of all high-quality reference genes of the M. javanica genome (NCBI BioProjects: PRJNA335369; Table 1). A total of 36,043 annotated RNAs were obtained, of which 21,302 were protein-encoding. A neural network graph based on self-organizing map (SOM) analysis 20 revealed dynamic changes of gene expression occurring at the individual developmental stages (Fig. 1b). We evaluated biological reproducibility by comparing the biological replicates individually by principle component analysis (PCA) with all skin appendages (Fig. 1c). As expected, a similar expression pattern was observed for some of the genes in both hair-and scale-type samples.
Based on these results, a total of 8,854 new genes were discovered in the transcriptomic datasets (Supplementary Table 1), of which 7,800 were annotated by GO and KEGG database, 38 new genes were identified as being involved in KEGG pathways related to skin appendage development and differentiation, such as apoptosis, cell cycle, Hedgehog signalling pathway, etc. The new genes provide new information for studying the differentiation mechanisms of pangolins.
To complement the transcriptomic analyses, we carried out a systematic label-free quantitative proteomic analysis. Following MaxQuant analysis, peptide sequences were searched against the transcriptome reference dataset. In total, 2,904 proteins containing 26,472 peptides were successfully identified from 175,115 matched spectra, of which 2,643 proteins were shared by the transcriptome and proteome data.

DEGs related to abdominal hair development of M. javanica.
To further understand the genetic basis of abdominal hair development of pangolin, we conducted DEGs analysis before (embryo-hair) and after (adult-hair) occurrence of abdominal hair. In total, 2,248 DEGs were identified between embryo-hair vs adulthair group; among them, 862 genes were up-regulated in adult-hair group and the remaining 1,386 genes were up-regulated in embryo-hair group (Supplementary Table 2). Based on the annotation of the GO database, the up-regulated DEGs in embryo-hair group were significantly enriched in the component categories related to transporter activity, including transmembrane transporter activity, substrate-specific transporter activity, ion transport, channel activity, etc. However, the up-regulated DEGs in adult-hair group were significantly enriched in categories related to filaments, such as    (Fig. 2a).
For the KEGG enrichment analysis, a total of 161 up-regulated and 117 down-regulated genes in embryo-hair group of 16 unique KEGG pathways related to the hair development process were identified. Among these genes, PI3K-Akt and the ECM-receptor interaction signalling pathway were significantly enriched, and several enriched pathways related to hair development such as focal adhesion, hedgehog, Wnt pathways, etc., were identified (Fig. 2b). It is noteworthy that in embryo-hair group, the up-regulated genes FGF10, VEGFC, FGF16, FGF18, FGFR4, THBS4, EREG, MCL1, and AREG do not only play an important regulatory role in the development of abdominal hair in pangolin, but they also participate in various processes including cellular apoptosis, proliferation, adhesion and attachment, and the inflammatory response. Furthermore, they are involved in epidermal growth, animal organ morphogenesis, and keratinocyte proliferation in GO annotation.
We conducted distinct expressed proteins (DEPs) analysis before and after hair development based on our proteome data. In total, 67 DEPs (embryo-hair vs adult-hair group) were identified, of which 34 proteins were shared with DEGs in the transcriptome (Supplementary Table 5). Among these DEPs, were proteins involved in cell attachment, controlling growth, and intermediate filaments in GO annotation, including 2 up-regulated genes in embryo-hair group: COL11A1 and POSTN, and 5 up-regulated genes in adult-hair group: FCGR2B, PTBP3, DSC1, CDSN, and CSTA.

DEGs related to dorsal scale development of M. javanica.
To identify genes that activate the development of pangolin scales, we analysed DEGs before (embryo-scale) and after (adult-scale) the occurrence of dorsal scale. A total of 2,556 DEGs were identified between embryo-scale vs adult-scale group, of which 875 genes were up-regulated in adult-scale group, and the remaining 1,681 genes were up-regulated in embryo-scale group. In the GO enrichment analysis with all DEGs, 7 terms related to the extracellular region and metallopeptidase activity were significantly enriched. In addition, 15 terms related to development and growth were mainly enriched in growth factor activity, regulation of cell growth, multicellular organism development, anatomical structure development, etc. (Fig. 3a). In the KEGG enrichment analysis, genes associated with PI3-Akt, ECM-receptor interaction, Hedgehog, and focal adhesion signalling pathways were significantly enriched (Fig. 3b). In addition, there were 16 signalling pathways related to scale development (Table 2). Notably, FGF1, SGK1, IGFBP3, TEAD4, SERPINE1, and KRAS genes were highly expressed in adult-scale group. We speculate that these genes play an important role in activating the development of pangolin scales.
We conducted DEPs analysis before and after scale development (embryo-scale vs adult-scale group) based on our proteome data. In total, 54 DEPs were identified, of which 17 proteins were shared with DEGs in the transcriptome (Supplementary Table 6). According to the gene description, 3 proteins (COL16A1, RAB21, and MAPRE1) were related to cell spreading, alterations in cell morphology, and cell adhesion, respectively. DEGs related to skin appendage differentiation of M. javanica. To explore genes controlling the differentiation of skin appendages in pangolin, we compared the gene expression profiles in skin appendages    Table 8). It is noteworthy that the Notch signalling pathway and PPDPF, DLL1, JAG1, VANGL2, WNT9A, LOC108401680, NOTCH1, and NOTCH3 gene were found in both annotation results, and are closely related to the differentiation of keratinocytes and epidermal cells.
According to our proteome data, differential proteome analysis identified 91 distinct significant proteins, including 73 up-regulated proteins in adult-hair group and 18 up-regulated proteins in adult-scale tissues (Fig. 5a). In the GO enrichment analysis with all DEPs, the CDSN gene was enriched in the skin morphogenesis term, which is also involved in biological processes related to skin development and keratinocyte differentiation. JUP, DSC1, CDH1, and TLN2 genes were enriched in cell adhesion and adherens junction terms, which were also involved in cell migration and regulation of cell population proliferation (Fig. 5b). In the KEGG enrichment analysis with all DEPs, 41 proteins in 21 pathways were related to the development and differentiation of skin appendages (Fig. 5c). We also analysed the subcellular localisation of all distinct proteins and found that the proportion of extracellular proteins was the highest (26.5%), followed by cytoplasmic proteins (17.7%) (Fig. 5d).
The mRNA information obtained from the transcriptome was integrated with the protein information identified by the proteome to identify any corresponding relationships. A total of 1,144 IDs were identified by both RNA-seq and proteomics (Fig. 6a). There were 6 distinct proteins corresponding to transcriptome DEGs: CSTA, XDH, PSMC5, KLK10, VCP, and MYOZ1. The correlation between the different multiples of genes (proteins) identified by transcriptome and proteome analysis in the two groups was analysed (Fig. 6b). The Pearson correlation coefficient between mRNA and the corresponding protein was positive (Pearson = 0.461). Thus, we conclude that it is important to determine the protein level to understand phenotypic changes and to not rely solely on mRNA levels.
Immune-related genes in the epidermis of M. javanica. In this study, many immune-related genes were also identified in tissues of skin appendages from different stages of development in pangolin. Through www.nature.com/scientificreports/ GO enrichment analysis, the immune-related DEGs before and after hair growth (embryo-hair vs. adult-hair group), 34 up-regulated genes and 4 immune-related biological processes were identified as enriched in adulthair group. These included defence response, inflammatory response, immune system, and immune response, while only 4 immune-related up-regulated genes and two terms were enriched in embryo-hair group (Fig. 7a).
The results obtained from KEGG enrichment analysis were similar to those of the GO enrichment analysis. Compared with the embryonic samples (embryo-hair: 15 signalling pathways and 61 immune genes), more immune-related pathways and genes (adult-hair: 17 signalling pathways and 123 immune genes) were enriched in the adult-hair group (Fig. 7b).
Through GO enrichment analysis with all DEGs, before and after scale growth (embryo-scale vs. adult-scale group), we found that 17 up-regulated genes and 3 immune-related biological processes were enriched in adultscale group. These included regulation of the immune system process, immune system, and immune response, while there was no enrichment of terms related to the immune response in embryo-scale group (Fig. 7c). The results obtained from KEGG enrichment analysis were similar to those of the GO enrichment analysis. Compared with the embryonic samples (embryo-scale: 15 signalling pathways and 61 immune genes), more immunerelated pathways and genes (adult-scale: 17 signalling pathways and 123 immune genes) were enriched in the adult-scale tissue (Fig. 7d).
We also compared and analysed the DEGs related to immunity in the dorsal and abdominal skin samples (adult-hair vs. adult-scale group) of adult pangolins. After GO enrichment analysis with all DEGs, both tissue types were found to be enriched in 2 terms related to immunity (immune system process and immune response), while the number of genes related to immunity and enriched in scale-type tissues (4 genes) was higher than that in hair-type tissues (2 genes) (Fig. 8a). In addition, we observed similar KEGG enrichment analysis results. Scale-type tissues had more immune-related pathways and genes than that found in hair-type tissues (Fig. 8b). Scale-type tissues exhibited 16 enriched immune-related pathways and 47 immune-related genes. The number of terms and genes enriched in hair-type tissues was significantly fewer than in scale-type tissues.

Discussion
This study provides the most detailed poly-omic analysis of pangolin skin to date, characterising the developmental and differential gene expression in a special mammal. Thus, we have applied both label-free quantitative proteomics and RNA-seq analyses to globally monitor changes in protein and mRNA levels in response to differentiation in both scale-type and hair-type skin. Consistent with previous findings 21 , we also found only a  www.nature.com/scientificreports/ moderate correlation between protein and corresponding mRNA levels. This means that the protein expression of pangolins is determined by the level of transcription and is also influenced by post-transcription conditions. The result of these responsible multiple regulatory pathways is that mRNA and protein levels are negatively correlated to the response of some proteins to differentiation. During the development of mammalian skin appendages, a large number of regulatory genes and molecular signalling pathways are involved, affecting the establishment of complex morphologies between the epidermis and the mesenchymal 22 . Specifically, different species of organisms can evolve a wide variety of hair types because certain regulatory genes and signalling pathways perform specific functions in different space and time 23 . Interactions between epithelial and mesenchymal tissues determine the morphological characteristics, size, and polarity of different skin appendages 24 .
Pangolin scales, which are soft on newborn pangolins but harden as the animal matures, are formed by keratins 8 , guaranteeing a high abrasive wear resistance 9 . Among the differentially expressed genes in the transcriptome, we found 17 keratin-related genes, and an up-regulated KRT2 protein was found in the differentially expressed proteins in the proteome. This is consistent with Cadieu et al. (2009) who reported that a combination of RSPO2 and keratin genes causes hair to become coarse and stiff in the developmental stage 13 .
In the KEGG enrichment analysis, 36 DEGs (adult-hair vs adult-scale group) were significantly enriched in the FoxO signalling pathway. The fork head box O (FOXO) family of transcription factors regulates the expression of genes in cellular physiological events including apoptosis, cell-cycle control, glucose metabolism, oxidative stress resistance, and longevity. In our data, TGFB2, TGFB3, MAPK11, EGFR, CNND1, BCL6, and TNFSF10 genes were identified as up-regulated DEGs in scale-type tissue. We hypothesise that these genes are involved in skin appendage development and follicle size. Jackowska et al. (2012) found that an increased TGFB1, TGFB2, and TGFB3 protein pattern and expression level in oocytes of large compared to small follicles 25 . These findings are consistent with our results here. www.nature.com/scientificreports/ In total, 89 DEGs (adult-hair vs adult-scale group) were enriched in MAPK and the cell cycle signalling pathway, which often work together in forming signalling loops in organogenesis 26 . They induce the most important fundamental biological process, the formation of periodic patterns. FGF, DUSP, VEGFA, EGF, and CDK6 have been identified as up-regulated genes in scale-type tissue. Hebert et al. (1994) demonstrated that the FGF gene plays an important role in hair cycle growth regulation, which functions as an inhibitor of hair elongation by promoting progression from anagen (the growth phase of the hair follicle) to catagen (the apoptosis-induced regression phase) 18 . Because of this, we believe that these genes may be important regulatory genes that affect the development and differentiation of scales.
Many of the genes we detected that vary in abundance in different skin types are involved in the Notch signalling pathway. These include CSTA, CDSN, PPDPF, DLL1, JAG1, VANGL2, WNT9A, LOC108401680, NOTCH1, and NOTCH3 genes, which were found in both KEGG and GO annotation results and are closely related to the differentiation of keratinocytes and epidermal cells. In particular, the up-regulated CSTA gene in adult-scale group was one of 6 DEGs in the transcriptome and proteome in the multi-omics analysis.
Through gene annotation, we not only found many genes related to the differentiation of skin appendages, but we also identified many genes and pathways related to diseases and immunity. Firstly, in the embryonic stage, a large number of genes and proteins related to immune function were detected in different skin regions, indicating that pangolins have a sound skin innate immune system. With the development and growth of an organism, its skin immune system tends to gradually improve, and the signalling pathways and regulatory genes involved in immune function become more complex. The immune function of the adult is more perfect than that of the embryo. Secondly, more immune-related genes and pathways were enriched in the back skin of the growing scales, indicating that the body may have some regulatory differences in the immune function of different skin regions. Thus, since the skin with the scales on the back looks smoother and thinner and more vulnerable to bacterial infection or other invasions, it appears that the body has differentially adapted to specifically protect this area.
Previous studies have found that some key genes related to immune function in the innate immune system of pangolins are pseudogenized or mutated, resulting in the loss of their immune pathways and functions involved in resistance to viral or bacterial invasion, and it is speculated that pangolins may have some undiscovered immune replacement compensation mechanisms 27,28 . However, we do not agree with some scholars who suggest that the evolution of pangolin scales is an innovative compensation for low immune function of the back skin 7 . This is because our data, to date, show that although a few immune genes (interferon) have been pseudogenised, a large number of complex immune-related signalling pathways and their gene participants normally performing immune functions can still be detected in the skin samples of pangolins. Therefore, we are more inclined to speculate that scale evolution of pangolin is more of an adaptive response to its living environment, enabling it to better adapt to burrowing, especially burrowing in other habits. Furthermore, scale evolution enables pangolin to walk better and more freely in caves and better protect the body from the wear of sand and stones. The gradual decline of the pangolin population is not due to its own low immunity, but it is more likely due to the irresistible hunting and habitat destruction by humans.
In summary, this study combined proteomics and transcriptomics to provide a comprehensive analysis of skin scale development and differentiation in pangolins. Our data highlight the correlation between the combined analysis of protein and transcript levels to understand the mechanisms that regulate gene expression in complex phenotypic responses. We expect that the data set reported in this study will provide a valuable resource for the scientific community and will facilitate future experimental design to study the development, differentiation, and even evolution of pangolin skin appendages as an adaptation to their environment.

conclusions
In this study, we applied transcriptome and proteome analysis of a pangolin species to explore the genetic basis of the development and differentiation of skin appendages. We have shown that the skin of pangolins has a healthy immune function. Our overall transcriptome and proteome analysis provided a rich list of genes and proteins expressed in hair-type and scale-type tissue of M. javanica. In total, 4,311 DEGs and 91 DEPs were screened from two types of skin appendages. Candidate genes related to 20 key signalling pathways are likely involved in regulating the growth and differentiation of the hair and scales. These molecular and signalling pathways greatly contribute to current genetic resources for pangolin and certain mammalian species with shaggy appendages and may help to understand human disease related to skin.

Materials and methods
Biological samples. Fourteen skin samples from the dorsolateral (hair-type) and the abdominal (scaletype) regions of M. javanica were collected from Guangdong Provincial Wildlife Rescue Centre. These individuals came from a wild population intercepted by customs and died of natural causes in the process of rescue. Pangolin hair and scale tissues were collected from 2 embryos (embryo-hair/scale) and 5 adult (adult-hair/scale) specimens, these samples included a small amount of epidermis and its attached skin appendages. 14 skin tissues representing the two types of appendages (7 hair-type and 7 scale-type) were rapidly excised, immediately snapfrozen on dry ice, and stored at − 80 °C until RNA extraction.
RnA extraction and RnA-seq. Total RNA from each tissue sample was extracted using the RNeasy Kit (Qiagen, Germany). RNA purity was checked using a NanoPhotometer spectrophotometer (IMPLEN, CA, USA). RNA integrity was assessed using the RNA Nano 6000 assay kit of the Bioanalyzer 2,100 system (Agilent Technologies, CA, USA). RNA concentration (ng/μl) was determined using the Qubit RNA assay kit and Qubit 2.0 Fluorometer (Life Technologies, CA, USA). Qualified RNAs were used for cDNA library construction and Scientific RepoRtS | (2020) 10:14566 | https://doi.org/10.1038/s41598-020-71513-w www.nature.com/scientificreports/ sequencing 29 . To examine the similarity between the different organ transcriptomes, the expression levels of the transcripts (FPKM) in the transcriptomes of each tissue were manipulated using the tool 'RSEM-calculateexpression' in the RSEM pipeline (https ://dewey lab.biost at.wisc.edu/rsem/READM E.html), which performs accurate transcript quantification from RNA-Seq data with a reference genome [30][31][32] . All of the sequencing stages and methods were conducted by Novogene Sequencing Company (Beijing, China).
total protein extraction. Tissue samples were ground individually in liquid nitrogen and lysed with lysis buffer containing 100 mM NH 4 HCO 3 (pH 8), 6 M urea and 0.2% SDS, followed by 5 min of ultrasonication on ice. The lysate was centrifuged at 12,000 g for 15 min at 4 °C and the supernatant was transferred to a clean tube. Extracts from each sample were reduced with 10 mM DTT for 1 h at 56 °C, and subsequently alkylated with sufficient iodoacetamide for 1 h at room temperature in the dark. Samples were subsequently mixed with 4-times volume of precooled acetone by vortexing before incubating at − 20 °C for at least 2 h. Samples were then centrifuged and the precipitate was collected. After washing twice with cold acetone, the pellet was dissolved in dissolution buffer containing 0.1 M triethylammonium bicarbonate (TEAB, pH 8.5) and 6 M urea 33-35 . Label-free quantitative protein analysis. The resulting spectra from each fraction were searched separately against X101SC19091078-Z01.fasta (41,843 sequences) database using the search engine Proteome Discoverer 2.2 (PD 2.2, Thermo). The search parameters were set as follows: mass tolerance for precursor ion was 10 ppm and mass tolerance for product ion was 0.02 Da. Carbamidomethyl was specified in PD 2.2 as a fixed modification. Oxidation of methionine (M) and acetylation of the N-terminus were specified in PD 2.2 as variable modifications. A maximum of two missed cleavage sites was allowed.
Identified proteins contained at least one unique peptide with FDR no more than 1.0%. Proteins containing similar peptides that could not be distinguished by mass spectrometry (MS)/MS analysis were identified as the same protein group. Precursor ion was quantified by the label-free method based on intensity and was used for label-free quantification. The protein quantitation results were statistically analysed by the Mann-Whitney Test. Proteins whose quantitation significantly differed between experimental and control groups (p < 0.05 and |log2FC|> 1.5 were up-regulated proteins p < 0.05 and |log2FC|≤ 0.67 were down-regulated proteins) were defined as differentially expressed proteins (DEPs).
Gene Ontology (GO) and InterPro (IPR) analysis were conducted using the InterProScan-5 program against the non-redundant protein database (including Pfam, PRINTS, ProDom, SMART, ProSiteProfiles, PANTHER) 36 , and the databases COG (Clusters of Orthologous Groups) and KEGG (Kyoto Encyclopedia of Genes and Genomes) were used to analyse the protein family and pathway. The probable protein-protein interactions were predicted using the STRING-db server 37 (https ://strin g.embl.de/). The enrichment pipeline 38 was used for enrichment analysis of GO, IPR, and KEGG. All sequencing stages were conducted at Novogene Sequencing Company (Beijing, China). ethics statement. All animal procedures in this study were approved by the ethics committee for animal experiments at the Guangdong Institute of Applied Biological Resources (reference number GIABR20170523) and followed basic principles. We confirm that all methods were performed in accordance with the relevant guidelines and regulations. ethics approval and consent to participate. All animal procedures were approved by the ethics committee for animal experiments at the Guangdong Institute of Applied Biological Resources (reference number GIABR20170523).

Data availability
Data analyzed in the current study are included within the article and its supplementary material. All unigene sequences from M. javanica have been deposited in the GenBank Sequence Read Archive (SRA) under accession number PRJNA610466 for SUB7099543. We have uploaded supplemental material to figshare via the GSA Portal.