The intragenic mRNA-microRNA regulatory network during telogen-anagen hair follicle transition in the cashmere goat

It is widely accepted that the periodic cycle of hair follicles is controlled by the biological clock, but the molecular regulatory mechanisms of the hair follicle cycle have not been thoroughly studied. The secondary hair follicle of the cashmere goat is characterized by seasonal periodic changes throughout life. In the hair follicle cycle, the initiation of hair follicles is of great significance for hair follicle regeneration. To provide a reference for hair follicle research, our study compared differences in mRNA expression and microRNA expression during the growth and repose stages of cashmere goat skin samples. Through microRNA and mRNA association analysis, we found microRNAs and target genes that play major regulatory roles in hair follicle initiation. We further constructed an mRNA-microRNA interaction network and found that hair follicle initiation and development were related to MiR-195 and the genes CHP1, SMAD2, FZD6 and SIAH1.

Transcriptome and MicroRNA data annotation. The amounts of transcriptome data in the telogen (T-1, T-2, and T-3) and anagen (A-1, A-2, and A-3) stages were 2.6 Gb, 2.3 Gb, 2.4 Gb, 2.4 Gb, 3.2 Gb and 2.4 Gb, respectively. After filtering out the low-quality sequences, we used Trinity software for de novo splicing to obtain the following amounts of transcriptome data: 198,542 bp, 219,454 bp, 219,411 bp, 247,695 bp, 250,081 bp, 247,695 bp and 203,451 bp. The total number of genes was 55,541, the number of alternative splices was 105,854, the average sequence length was 1965.15 bp, the length of the longest sequence was 23,311 bp, and the length of the shortest sequence was 351 bp.
The genes that control skin development differed significantly in the course of one year ( Table 1). The first significant difference in gene number occurred from February to March, and the second occurred from March to April; thus, in March, there was a period of significant change for the genes in the skin, and these changes were relatively independent. The third significant change was from June to July, and the fourth significant change was October to November. After November and until February of the following year, changes in skin genes were almost absent.
From sRNA sequencing of skin tissue samples from the repose and growth stages, we obtained telogen (T-1 Table 2) that could be used for subsequent analysis.
We also performed statistical analysis of the length distribution of the high-quality sRNA reads (Fig. 2b), and the results showed that total reads were primarily 22 nt long, accounting for 31.46% of reads, and that the proportions of reads that were 20, 21, 22, and 23 nt in length among distinct reads were 14.4%, 14.01%, 13.54% and 12.43%, respectively. Comparing our high-quality sequencing data to known data in the MiRbase database (Fig. 2c), we obtained 437,517 known microRNA sequences, accounting for 47.41% of sequences, and 4,852,979 unknown microRNA sequences, accounting for 52.29% of sequences, which included rRNAs, tRNAs and snoRNAs, among others. There were 333 precursor microRNAs and 470 mature body microRNAs, which were classified into 150 families. We carried out a forecast of these new miRNAs and found that there were 92 precursor microRNAs and 99 mature body microRNAs. Differential analysis of the transcriptome and miRNAs. From a differential analysis of the differentially expressed genes that were related to villus growth in March and July (Table 1), we obtained 12,865 differentially expressed genes in the telogen and anagen stages, of which there were 7,664 upregulated and 5,201 downregulated genes (Fig. 2a). A differential analysis of mature microRNAs expressed during the growth and rest periods revealed 35 microRNA genes that were upregulated in the telogen stage compared to their expression in the anagen stage, and there were 16 genes with more than 100-fold higher expression in the anagen stage. We also found 9 downregulated microRNA genes in the anagen stage relative to the telogen stage, and there were 6 genes with more than 100-fold higher expression in the telogen stage: MiR-148a, MiR-34b, MiR-195, MiR-335, MiR-7g*-1, and MiR-101*-1. Thus, we successfully identified the major microRNAs and their target genes.

Mutual analysis of microRNAs and mRNAs.
The relationship between microRNA and mRNA is complex: one microRNA can target multiple mRNAs, and vice versa. A target gene can be regulated by multiple microRNAs simultaneously, and the relationship between microRNAs and mRNAs is not always limited to negative regulation. The mechanism by which microRNAs regulate mRNAs is not yet clear. This study explored the expression correlations between microRNAs and mRNAs through linear programming and determined the values of the differential multipliers for microRNAs and mRNAs (Fig. 3). Above all, the points in the second quadrant and the fourth quadrant were negatively associated with miRNA and mRNA. These miRNAs that were negatively related to mRNA will be our next area of study.
Functional enrichment analysis of differentially expressed microRNA target genes. We also performed GO annotation and KEGG pathway analysis. As the results show, the GO database ( Fig. 4a) included 8,409 differentially expressed mRNAs in the telogen relative to the anagen stage, of which there were 3,205 differentially expressed mRNAs upregulating microRNAs, and the number of differentially expressed mRNAs downregulating microRNAs was 4,723. The number of microRNAs negatively regulating their target mRNAs was 2,025. The KEGG database (Fig. 4b) included 5,432 differentially expressed mRNAs between the telogen and anagen stage; among these, there were 1,322 differentially expressed mRNAs upregulated by microRNAs. A collection of the most significant top 8 results for GO and KEGG enrichment is presented in Fig. 5. The GO annotations of the most concentrated microRNA target genes were as follows, from high to low: integral component of the membrane, DNA binding, nucleus, catalytic activity, nucleotide binding, transcription factor activity, transmembrane transport and the oxidation-reduction process. These processes were related to cell development, indicating that a large number of cell proliferation processes occurred during the initiation of hair follicles.
The results of target gene KEGG annotation from high to low were as follows: insulin signalling pathway, focal adhesion, cancer pathways, Wnt signalling pathway, phagosome, bacterial invasion of epithelial cells, arrhythmogenic right ventricular cardiomyopathy and regulation of the actin cytoskeleton. Similar to the results obtained with KEGG pathway enrichment, we found that the concentration of insulin signalling pathways was the highest; there were 220 genes annotated to this pathway. Although there have been no reports of the role of insulin signalling in hair follicle development to date, the insulin signalling pathway is closely related to cell growth, and its potential involvement in hair follicle development will be further studied. We also analysed the signalling pathways related to hair follicles, including the Jak-STAT signalling pathway, Notch signalling pathway and Wnt signalling pathway, which were previously reported. The corresponding numbers of genes identified by annotation were 123, 348 and 262, respectively. The function enrichment P value of the Wnt signalling pathway was 7.29E-64, which indicated a closer relationship with villus development, as shown by the large number of genes in the signalling pathway in skin tissue exhibiting targeted regulatory relationships with the differentially expressed microRNAs. Therefore, Wnt signalling plays an important role in the development of hair follicles. mRNA-microRNA network analysis. The telogen and anagen stages in cashmere goat skin showed differential expression of microRNAs and their negatively regulated target genes according to the construction of a data relation network based on GO annotation and network construction based on KEGG relationships (Fig. 6). Genes differentially expressed in the telogen and anagen stages have many functions in cashmere goats (Fig. 6a), comprising a complex regulatory network with microRNAs. With GO annotation, there were 350 upregulated genes and 13 microRNAs clustered in the lower part of the network. Additionally, 233 downregulated genes and  Table 3. MicroRNA and mRNA correlation analysis results. The KEGG pathway network diagram was similar to the GO annotation network diagram (Fig. 6b). The genes were clearly divided into two groups, with 184 upregulated genes and 13 microRNAs clustered in the lower part of the network. Fourteen genes were annotated in the Wnt signalling pathway, including calcineurin B homologous protein 1 (CHP1), E3 ubiquitin-protein ligase (SIAH1), FZD6, WNT5A and mothers against decapentaplegic homolog 2 (SMAD2), as well as MiR-195, MiR-335star and other microRNA genes. One hundred twenty-eight downregulated genes and 31 microRNAs were clustered above the network. The genes and microRNAs annotated in the Wnt signalling pathway were 2A5G, CUL1, LRP5, FOSL1, MiR-17astar, MiR-2285m, MiR-136 and MiR-34astar. Therefore, there is a complex signalling pathway between the genes expressed during rest and the genes downregulated during the growth stage.
Up-and downregulated target genes as well as the negative regulatory effects of microRNAs during the period of transition from the telogen to the anagen stage were clearly clustered into two groups. The brown dots in the network map indicate the functional annotation of target genes. Some functions in the map were annotated only for upregulated genes, some functions were annotated only for downregulated genes, and others had functions in both up-and downregulated transcription. This finding indicates that the thresholds for gene balance or gene expression and for differentially expressed genes were relatively important for a specific function. The interactions between microRNAs and their target genes played roles in the initiation and growth of cashmere. This finding further indicates that the initiation of hair follicle growth was controlled by multiple genes and that microRNAs play a role in the post-transcriptional negative regulation of target genes in hair follicle initiation.

Construction of a control network for the Wnt signalling pathway. Multiple genes and microRNAs
were involved in the Wnt signalling pathway during goat hair follicle development (Fig. 7a). In addition, the genes and microRNAs in the Wnt signalling pathway also played roles in other related pathways and interacted with other genes. In this study, we used a "biogrid" database to download the genes that interacted with target genes and then merged the genes from the Wnt signalling pathway to construct the Wnt signalling network (Fig. 7b).
Using the GO regulatory network, KEGG regulatory network and the Wnt signalling pathway regulatory gene interaction network in the goat hair follicle resting phase, we screened for genes with upregulated expression in the Wnt signalling pathway from the telogen to the anagen stage, including CHP1, FZD6, SIAH1 and other genes. Keratin and keratin-associated proteins were constitutive proteins during cashmere production. MiR-195 was MicroRNAs. The P value is the critical threshold for differential microRNA enrichment; the Rich factor is calculated for S/B, S is the mRNA with microRNA target relations annotated to a certain function or path, and B is used to annotate the mRNA of a function or path, based on the mRNA ratio. In other words, the Rich factor represents the ratio and is a percentage; the dot size represents the mRNA number, that is, S.
SCIEntIfIC REPORTS | (2018) 8:14227 | DOI:10.1038/s41598-018-31986-2 predicted using target gene prediction and found to negatively regulate KAP24-1, KAP3-1, KAT8 and the important regulatory factor EGFR, which constitute the main components of cashmere. The genes corresponding to KAP24-1, KAP3-1, KAT8 and EGFR were CHP1, FZD6, SIAH1 and SMAD2. Therefore, we chose CHP1, FZD6, SIAH1 and FZD6 as the regulatory genes during the initiation of villus growth for further verification. CHP1, FZD6, SIAH1, and SMAD2. The expression of CHP1, FZD6, SIAH, and SMAD2 was verified via fluorescence quantitative PCR, using β-actin as a reference gene. As shown in Fig. 8, compared to the expression of microRNA-195 (Fig. 8a), the expression level of the CHP1 gene (Fig. 8d) was higher in the anagen stage than in the telogen stage, and the difference was relatively significant (P < 0.01), with a descending trend from the early growth stage to the late growth stage that was not significant (P > 0.05). The expression level of the Frizzled-6 (FZD6) gene (Fig. 8e) increased from the early stage to the telogen stage, but the difference was not significant (P > 0.05). However, the expression level increased significantly from the early growth stage to the late growth stage (P < 0.01). The expression of the gene SIAH1 (Fig. 8c) increased significantly from the telogen to the early growth stage (P < 0.01), but there was no significant difference from the early growth stage to the late growth stage (P > 0.05). The SMAD2 gene (Fig. 8b) showed no significant difference from the telogen to the anagen stage  (P > 0.05), but its expression increased from the telogen to the late growth stage. In the Wnt signalling network, two of the three key genes were significantly expressed from the telogen to the anagen stage.

Discussion
Farazi T A et al. showed that a computer-aided algorithm could be used to integrate miRNA and mRNA paired expression spectra to characterize miRNA-mRNA interactions and that this approach identified miRNA targets with high accuracy. This approach effectively avoids certain software predictions that do not consider specific cases of miRNA and mRNA expression [24][25][26] . Chen et al. published an article describing a model for predicting disease associations by mixing microRNAs into computations (HAMDA), which significantly reduced experimental time and cost [27][28][29] . With HAMDA, the functions of microRNAs can be correlated with disease characteristics, and the role of new microRNAs in disease can be revealed more clearly. Additionally, the authors proposed a method of using Laplacian Regularized Sparse Subspace Learning for MicroRNA-Disease Association (LRSSLMDA), which projects a microRNA and a disease into a common subspace to interpret the function of the microRNA subspace 27,[30][31][32] . Based on these methods and on a joint analysis of miRNA-mRNA, we conducted a targeted prediction of miRNA-mRNA interactions and used reverse locations of mRNAs to predict miRNA target genes. After obtaining positive target genes, we used RT-qPCR to further verify the target genes and determine the interactions between target genes regulated by miRNAs.
In this study, target genes of differentially expressed microRNAs were analysed using GO orthology enrichment analysis and KEGG pathway enrichment 33 , as described in Chen L's study. The most commonly annotated GO function was integral component of the membrane within cellular components, indicating that microRNAs can regulate villus initiation and the villus growth cycle by regulating cell membrane-associated gene expression. The KEGG pathway results showed significant enrichment of the Jak-STAT signalling pathway, Notch signalling pathway, p53 signalling pathway and Wnt signalling pathway. These cashmere growth-related pathways were found via analysis of differentially expressed microRNA target genes in the KEGG pathway map, and the Wnt signalling pathway was found to play a particularly important role. Therefore, this study constructed the Wnt signalling pathway regulatory network during hair follicle initiation.
The American sociologist Granovetter first posited the concept of relationship intensity and divided relationships into strong and weak relationships 34 , with strong ties maintaining the relationships between groups and organizations and weak ties between groups and organizations. Information obtained from strong relationships is often highly repeatable, while weak relationships provide more information and other resources than strong relationships. Based on the ideas provided by social networks, we hypothesize that strong and weak relationships exist between genes. A gene associated with more genes in a network is probably an important gene or key node gene that controls or influences a trait 35 . Thus far, efforts such as pathway analysis have raised awareness of the functional contributions of gene mutations and DNA copy number variations to cancer development, progression and metastasis 36 . Wang E commented on the challenges associated with studying cancer omic data using an integrative network approach and suggested possible research directions. As a result, most current cancer genome sequencing work has mapped mutations onto biological pathways, including signalling pathways 37,38 . Mcgee S R indicated that by studying the distributions of such network motifs, insight into cancer signalling regulatory molecular mechanisms of tumorigenesis can be gained, and the identification of these loops has practical implications for predicting prognosis and the clinical outcomes of cancer patients. Collectively, network motifs and modules are critical for cancer signalling and are associated with clinical outcomes 39 . Additionally, as stated by Han P, analysis of GRNs can identify regional subnetworks for certain biological processes; in-cloud regulatory structures between genes, key regulators, and cancer hallmark subnetworks; network dynamics for network rewiring; and network motifs. Furthermore, such results may reveal the molecular mechanisms of signalling pathways associated with cancer hallmarks and cancer patient outcomes 38,[40][41][42] . We found a complex network between genes in the Wnt signalling pathway and other genes and signalling pathways during hair follicle development from rest to growth, indicating that hair follicle initiation is determined not by single genes but by differences in the number of interactions between genes. One or more key genes are present in this gene regulatory network. In the Wnt signalling pathway, gene interactions are centred on the CHP1, SIAH1, and SMAD2 genes; however, many inter-gene interactions and multiple signalling pathways are involved.
As indicated by previous studies, microRNAs have negative regulatory roles [43][44][45][46] . In this study, quantitative analysis of MiR-195 expression during each period of chorionic villus initiation was performed. MiR-195 showed a trend of declining expression during the periods of repose, prophase and growth. In our experimental results, we found that two genes, SMAD2 and FZD6, were negatively correlated during the periods of repose, prophase and growth. Therefore, we speculate that MiR-195 continues to inhibit the expression of SMAD2 and FZD6 during the entire initiation process. At present, no information is available regarding the regulation of SMAD2 and FZD6 targeting in by-195. However, a study by Mo J showed that MiR-195 regulates cell proliferation (Mo J. et al. 47 ). Zheng R., Du J. et al. showed that downregulated MiR-195 was targeted to promote the expression of SMAD7 in two-leaf aortic valve calcification 48,49 . MiR-195 may play an important role in the regulation of SMAD family genes. However, two genes, SIAH1 and CHP1, showed a trend that first increased and then decreased. Presumably, MiR-195-mediated inhibition of these two genes after the initiation of cashmere production is affected by other microRNAs and is relieved or reduced. A study by Zhang X showed that MiR-195 affects the proliferation of colon cancer cells and regulation of Wnt/β-catenin pathway protein-specific MiR-195 by targeting FGF2 and regulation of the Wnt/β-catenin signalling pathway, consistent with studies showing the effects of MicroRNA-195 on Wnt signalling 47,50 .

Materials and Methods
Animals. The study samples were collected from an Inner Mongolian Arbas white cashmere goat breeding farm. All animal experiments were performed in accordance with the 'Guidelines for Experimental Animals' of the Ministry of Science and Technology (Beijing, China). All surgeries were performed according to recommendations proposed by the European Commission (1997) and were approved by the experimental animal ethics committee of Inner Mongolia Agricultural University. Six cashmere goats were selected from the same growth environment and had equal body weights, unrelated relationships, were of the same sex, and had good growth conditions. The sampling times occurred during the telogen and anagen stages. The sampling position was the side body or the middle of the body at a distance of 10-15 cm from the scapula, and the sample size was 2 cm 2 . The samples were stored at −80 °C in a freezer. Original data processing and differential expression analysis. Seqprep, sickle, and condetri_v2.0.pl software were used to evaluate the original data for transcriptional sequencing, and Trinity software was used for de novo splicing and open reading frame (ORF) prediction. We predicted ORF sequences using BLAST (BLAST Version 2.2.25, with an E-value parameter value setting of less than 10 −5 ) in the NCBI NR database with BLASTP database identification of the string database and KEGG database without predicted sequences or NR, string, or gene library ORF BLASTX comparison notes. The expression of each gene in each sample was counted based on a comparison between the sequencing sample and the reference genome, and gene expression levels were calculated using the maximum likelihood method in RSEM software. EdgeR software was used to find significant differences in the expression of all transcripts in the sample at a threshold of P = 0.05.
Short oligonucleotide alignment program (SOAP) software was used to locate sRNAs in the genome. "Tag2repeat" software was used for repeated sRNA comparisons. Using overlap software, sRNAs were compared to the exons and introns of mRNAs, and sRNAs derived from mRNA degradation were found. The GenBank and Rfam (9.1) databases (database link) were used to annotate the data and remove rRNAs, scRNAs, snoRNAs, snR-NAs, and tRNAs from the sRNAs as much as possible. The sequences were compared to microRNA precursors and the mature bodies of cattle and sheep in MiRBase21 (http://www.MiRbase.org/) to obtain known microR-NAs. We analysed whether there were significant differences in the expression of known microRNAs (P = 0.05) and compared microRNA expression patterns using a log 2-ratio and scatter plot. Association analysis. Target gene prediction for the initiation of related microRNAs. Targetscan (parameter: context score percentile of ≥90) and MiRanda (parameter: max energy of ≤−20) software were used to predict the differentially expressed microRNA target genes in the telogen and anagen stages, and the target regulatory relationships between microRNAs and mRNAs were determined based on sequence complementarity, sequence conservatism, thermo-kinetic factors, site-binding ability and UTR base distribution. The relationships between microRNAs and mRNAs are complex. A microRNA can target multiple mRNAs simultaneously. In contrast, a target gene mRNA can be regulated by multiple microRNAs simultaneously. We used ACGT101-CORR (1.1 version) software to extract and perform statistical analysis for data describing target regulation relationships, specifically positive and negative relationships between the expression of microRNAs and their target genes expressed as the difference multiplier of microRNA and the difference multiplier of mRNA (R correlation analysis).
Analysis of genes related to chorionic cycle initiation. This study used GO enrichment and KEGG functions for gene analysis. The target functions of all genes or annotations among the differentially expressed miRNAs were counted, and hypergeometric tests were used to determine the functions or pathways that were significantly enriched among differentially expressed miRNA-mRNA relationships. The formula for calculating P value significance was ∑ = − . If the microRNA target gene of the functional annotation satisfied this condition, we defined the result as distinct, significant expression. In the formula, TB is the total number of mRNAs with functional annotations or path annotations, TS is the number of mRNAs corresponding to the differential expression of miRNA in the TB,  B is the number of mRNAs annotated for a particular single function or a particular path, and S is the number of mRNAs annotated for a specific single function or a specific miRNA in a particular path. According to the first 8 GO functions and KEGG pathways with the smallest P values, we sorted the results and performed an enrichment analysis. Among KEGG pathways, the pathway that is most closely related to hair follicle development is the Wnt signalling pathway. We separated microRNAs from their target genes in the Wnt pathway using the biogrid database and downloaded the network interaction, microRNAs, target genes, Wnt genes and signalling pathway interactions, which were distinguished by various colours, characteristics and attributes such as size, using Cytoscape software to obtain the functions in the Wnt pathway network analysis chart.
Analysis of main genes and the expression of microRNAs acting on the chorionic cycle. Real-time fluorescent quantitative PCR was performed using the FAST SYBR Green Master Mix Kit (Applied Biosystems, USA). Fluorescent quantitative primers were designed with primer5 (Table 4). PCR reactions were performed using an Agilent 3000XP Real Time PCR amplification instrument; the PCR reaction program was 95 °C for 20 sec, 1 cycle; 95 °C for 5 sec; 60 °C for 30 sec, 40 cycles; 95 °C for 1 min, 55 °C for 30 sec; and 95 °C for 30 sec, 1 cycle. Each sample was tested 3 times, and 3 blank controls were used for each primer. We used the 2 −ΔΔCt method for relative quantitation between samples, and the reference gene was β-actin. Using the statistical analysis software SPSS 17.0, we tested differences in the relative expression of keratin-associated protein genes in Arbas cashmere goat skin in different months using two methods: LSD and Duncan's test. The results were expressed as the average value ± standard deviation.