Transcriptome Dynamics During Turbot Spermatogenesis Predicting the Potential Key Genes Regulating Male Germ Cell Proliferation and Maturation

Spermatogenesis is a dynamic developmental process in which spermatogonial stem cells proliferate, differentiate and mature into functional spermatozoa. These processes require an accurate gene regulation network. Here, we investigated the dynamic changes that occur during spermatogenesis through a combination of histological and transcriptome analyses of different developmental stages of the testis. We constructed 18 testis transcriptome libraries, and the average length, N50, and GC content of the unigenes were 1,795 bp; 3,240 bp and 49.25%, respectively. Differentially expressed genes (DEGs) that were related to germ cell proliferation and maturation, such as NANOS3, RARs, KIFs, steroid hormone synthesis-related genes and receptor genes, were identified between pairs of testis at different developmental stages. Gene ontology annotation and pathway analyses were conducted on DEGs with specific expression patterns involved in the regulation of spermatogenesis. Nine important pathways such as steroid hormone biosynthesis related to spermatogenesis were identified. A total of 21 modules that ranged from 49 to 7,448 genes were designed by a weighted gene co-expression network analysis. Furthermore, a total of 83 candidate miRNA were identified by computational methods. Our study provides the first transcriptomic evidence for differences in gene expression between different developmental stages of spermatogenesis in turbot (Scophthalmus maximus).


Results
Histological analysis of turbot male germ cells during spermatogenesis. A haematoxylin and eosin (HE) stain analysis was conducted to understand the proliferation and maturation processes of male germ cells during testis development. Testis tissues were collected at the following different reproductive stages: MSII, MSIII, MSIII-MSIV, MSIV, MSV and MSVI. Morphological observations of the testes and male germ cell structural features are shown in Fig. 1. Spermatogenesis produced cysts and the germ cells of each cyst were in synchronous development. The type and number of germ cells per cyst changed from the MSII to MSVI stages. The histological developmental stage was determined according to the characteristics and ratios of male germ cells. Male germ cell proliferation mainly occurred at MSII and MSIII stages. At the MSII stage, the germ cells included spermatogonium (26.10%) and primary spermatocyte (48.95%) cells. The MSIII stage was the initiation stage of the gonad annual breeding cycle, during which the germ cells demonstrated a higher proliferative activity, including mitosis and meiosis. At this stage, the germ cells were spermatogonium (15.06%), primary spermatocytes (53.87%), secondary spermatocytes (15.03%) and other spermatids (15.87%). At the MSIV stage, most spermatocytes had developed into spermatids, and some of the spermatids had developed into spermatozoa, which then matured and hydrated at the MSV stage. The MSVI stage was the testis recession stage where the germ cell types were mainly primary spermatocyte (63.10%). At the MSVI stage, the ratios of different germ cell types were similar to each other but different compared to the MSIII stage (Table 1). Sequence analysis and de novo assembly. In this study, 18 cDNA libraries (MSII, MSIII, MSIII-IV, MSIV, MSV and MSVI) were constructed using total RNA from different development stages during spermatogenesis. There were three biological replicates at each stage. To ensure that the RNA-seq data satisfied the criteria for the transcriptome analyses, we conducted standard analyses for quality control. We generated approximately 120.7 Gb bases in total after Illumina HiSeq sequencing. We then produced 132,416 unigenes after assembling all of the samples together. The total length, average length, N50, and GC content of the unigenes were 237,774,222 bp; 1,795 bp; 3,240 bp and 49.25%, respectively. Unigenes were aligned to seven functional databases to obtain possible functional information, and 85,260 (NR database: 64.39%), 103,001(NT: 77.79%), 76,504 (Swissprot: 57.78%), 37,443 (COG: 28.28%), 74,679 (KEGG: 56.40%), 7,804 (GO: 5.89%) and 60,376 (Interpro: 45.60%) unigenes were annotated.
DEGs at different stages. The differential expression of genes at different stages was analysed. The greatest number of DEGs were identified between the MSII and MSIV stages. A total of 6,952 genes were up-regulated and 2,120 genes were down-regulated from MSII to MSIV. Second, in comparisons between MSII vs. MSIII, MSII vs. MSV, MSIV vs. MSVI, MSV vs. MSVI, MSIII vs. MSV and MSIII vs. MSVI, the number of DEGs ranged from 1,885 to 3,420. The number of DEGs was lowest at MSII vs. MSVI, MSIII vs. MSIV and MSIV vs. MSV, with values ranging from 300 to 721 DEGs. A total of 1,222 genes were up-regulated and 959 genes were down-regulated from MSIII to MSV, 141 genes were up-regulated and 159 genes were down-regulated at MSV compared to MSIV, and 914 genes were up-regulated and 1,829 genes were down-regulated from MSV to MSVI (Fig. 2a). A greater number of DEGs were detected between the proliferation and maturation stages than between adjacent development stages. The screened key DEGs between the development stages that were identified in the transcriptome database are listed in Table 2 [log2 (fold) ≥4, P ≤ 0.05]. We identified 1,823 genes that were specifically expressed at the MSII stage, 3,155 at the MSIII stage, 3,970 at the MSIV stage, 2,088 at the MSV stage and 2,376 at the MSVI stage (Fig. 2b).
During the process of the testis development cycle, at the MSIII stage, the onset of the spermatogenesis stage, germ cells multiplied greatly. The expression of kinesin and myosin were up-regulated significantly, which regulated the activity of microtubules. Steroid metabolism-related genes such as cytochrome P450 20A1 (CYP20A1), thyroid hormone receptor-associated protein 3 (THRAP3), oestrogen receptor beta-a (ERβ-a) and 17-beta-hydroxysteroid dehydrogenase 14 (HSD17B14) were up-regulated significantly in the MSIII stage vs the MSII stage, and played an important role in the onset of spermatogenesis. In addition, the insulin-like growth factor 1 receptor (IGF1R), insulin receptor substrate 2-B-like (IRS2B), matrix metalloproteinase-15 (MMP15), spermatogenesis-associated protein (SPATA), autophagy-related protein (ATG) and pannexin 1 (PANX1) genes were also present at a significantly high expression level in MSIII, suggesting that they may be involved in male germ cell proliferation at spermatogenesis initiation of the annual breeding cycle. At MSV, the sperm mature stage, steroid hormone-related genes and pathways were up-regulated significantly. The STAR, testosterone 17-beta-dehydrogenase 3 (HSD17B3), steroid 21-hydroxylase (CYP21A2), luteinizing hormone receptor (LHR), follicle stimulating hormone receptor II (FSHRII), stimulated by the retinoic acid gene 6 protein (STRA6), prolactin receptor (PRLR), steroidogenic factor 1(NR5A1) and oestrogen receptor beta genes (ERβ), among others, showed obvious high expression levels at the MSV stage. The steroid hormone-related genes and pathways are mainly responsible for regulating the last stage of sperm maturation (Table 2).
Functional annotation of DEGs. We performed a gene ontology (GO) classification and functional enrichment on DEGs of different stages. In total, 190 genes that were differentially expressed between the MSII and MSIII stages were assigned molecular functions (180 terms), followed by 174 DEGs that were assigned biological processes (621 terms) and 144 that were assigned cellular components (91 terms) (Fig. 3a). A total of 892 GO terms were assigned to DEGs from the MSII vs MSIII stage comparison. The GO analysis revealed transcripts related to specific processes such as cilium (GO:0005929), cilium organisation (GO:0044782) and Kupffer's vesicle development (GO:0070121) in MSII vs MSIII. Consistently, DEGs between MSII and MSIII were also enriched in male germ cell proliferation processes such as the regulation of cell division (GO:0051302), regulation of mitotic cell cycle (GO:0007346), regulation of cell proliferation (GO:0042127), germ cell development (GO:0007281), spermatogenesis (GO:0007283) and male gamete generation (GO:0048232). Furthermore, a total of 146 DEGs between MSIII and MSV were assigned molecular functions (189 terms) followed by 146 DEGs assigned biological processes (581 terms) and 133 assigned cellular components (95 terms) (Fig. 3b). In total, 865 GO terms were identified for genes that were differentially expressed between the MSIII and MSV stages. GO analysis between MSIII and MSV uncovered genes related to the cytoskeleton such as microtubule cytoskeleton (GO:0015630), microtubule motor activity (GO:0003777), Kupffer's vesicle development (GO:0070121), cilium assembly (GO:0042384) and cilium morphogenesis (GO:0060271) genes. GO terms related to steroid metabolism such as steroid hydroxylase activity (GO:0008395), 17-alpha-hydroxyprogesterone aldolase activity (GO:0047442), steroid hormone receptor activity (GO:0003707), progesterone metabolic process (GO:0042448), steroid metabolic process (GO:0008202), steroid hormone mediated signalling pathway (GO:0043401) and cellular response to steroid hormone stimulus (GO:0071383) play important roles in male cell maturation (Supplementary Table S1). The pathway analysis with the KEGG pathway mapping database provided information on common pathways involved in spermatogenesis. In this study, nine important pathways (steroid hormone biosynthesis, ovarian steroidogenesis, prolactin, TGF-beta, Wnt, GnRH, mTOR, oestrogen and p53 signalling pathways) related to male germ cell proliferation and maturation were identified (Table 3). GO and KEGG annotations were helpful for identifying potential genes related to developmental processes.

Validation of DEGs with quantitative real-time PCR (qRT-PCR).
To evaluate our DEG library, the expression levels of six DEGs, which were primarily involved in spermatogenesis, were analysed by qRT-PCR. Fig. 4 shows the qRT-PCR results, which reflect the same expression trends that were observed through the RNA-seq analysis.   Table 1. Proportion of germ cells in the testis of male turbot collected over an annual cycle. Note: The parameters tested were determined using a one-way ANOVA and identified using the Student-Newman-Keuls (SNK) test (P < 0.05). The values are presented as mean ± SD. According to SNK analysis, the letter "a" represents the germ cells proportion whose mean values are relatively larger. From "a" to "d", the mean values decrease in turn. There is no significant difference with the same letter between different germ cell types in the same stage. There is a significant difference with different letters, (P < 0.05) between different germ cell types in the same stage.  The expression changes in the selected genes (CYP20A1, LHR, KIF15, STAR, BECN1 and MNS1) along the reproductive cycle were significantly variable (ANOVA, P < 0.05) depending on the reproductive phase. From MSII to MSIII, CYP20A1, KIF15 and MNS1 were significantly up-regulated, while LHR, STAR and BECN1 were significantly up-regulated from MSIII to MSV. These qPCR results showed a similar temporal expression pattern to the one observed following an RNA-seq analysis during the different reproductive phases.

Co-expression network analysis of DEGs.
A weighted gene co-expression network analysis (WGCNA) with scale-free topology was applied to show co-expressed genes across the whole reproductive cycle to systematically and globally identify highly connected gene subnetworks or modules. Groups of genes exhibiting very similar patterns for each module were then detected using hierarchical clustering based on the topological overlap calculations.
The expression values of the identified DEGs between different stages were considered, and a total of 21 modules were designed by using different colours in the WGCNA network. Each module had a size ranging from 49 (thistle2 module) to 7,448 genes (red module) (Fig. 5a,b). Furthermore, WGCNA was also used to identify highly correlated genes within these modules. The genes within the modules exhibited more topological overlap than the genes across the modules in the topological overlap heatmap. Modules were individually analysed for the enrichment of GO and KEGG pathways. The terms were screened by extracting those containing the reproduction-related DEGs with a fold change ≥4 and padj ≤0.05 from the modules, which was conducted to provide a biological interpretation of the   constructed gene networks. For example, CYP26A1 from brown module while ER-β, IGF1, TRIP10 and MAPK4 of endocrine system from black module were identified in the gene network visualisation (Fig. 5c,d).

Identification of the genes involved in spermatogenesis and spermiogenesis. The previously
reported genes related to male germ cell proliferation and maturation identified in the present study are listed in Supplementary Tables S2 and S3. Members of the gene families AQP (4 isoforms), SOX (16 isoforms), and BMP (15 isoforms) and SPATA (9 isoforms) were widespread and present at high expression levels (Supplementary  Table S2). In addition, androgen receptor (AR), follicle stimulating hormone receptor (FSHR), gonadal soma derived factor (GSDF), growth differentiation factor 9 (GDF9), meiotic nuclear division protein 1 (MND1), anti-Mullerian hormone (AMH), DMRT, sperm flagellar protein 2 (SPEF2), synaptonemal complex protein 3 (SYCP3), membrane progesterone receptor alpha (PAQR7), progesterone receptor (PGR), 20beta-hydroxysteroid dehydrogenase (HSD20B) and melatonin (MLT) have been previously reported and were also identified in the turbot testis transcriptome (Supplementary Table S3). The genes listed above were all identified in the turbot transcriptome database, though they did not present significantly different expression changes between the different stages, which suggests that they may play multiple roles in regulating the whole spermatogenesis process.
Turbot miRNA screening from RNA-seq data. A total of 83 candidate miRNAs were identified by computational methods. The sequences and structural properties of known miRNAs were used to screen the candidate miRNAs in the turbot transcriptome databases. The length range of the predicted pm-miRNAs was 18-24 bp. The number of sequences with 22 bp (51.81%) was significantly higher than others ( Supplementary  Fig. S1). The identified miRNAs are listed in Supplementary Tables S4. Spermatogenesis-related miRNAs, such as miR-34, miR-let-7, miR-29, miR-135, miR-449, miR-203 and so on were identified in the sm-miRNA database (Supplementary Table S4).   Table 3. The number of all genes and DEGs annotated in the pathways. The nine pathways are related to turbot spermatogenesis. and MNS1 during different developmental stages were detected by qRT-PCR. UBQ and RSP were used as reference genes for normalisation of qRT-PCR data. Bars represent the standard deviation (SD). The x-axis indicates the developmental stage. The y-axis shows the relative expression level of genes. Note: The parameters tested were determined using one-way ANOVA and identified using the SNK test (P < 0.05). According to SNK analysis, the letter "a" represents the relative expression levels whose mean values are relatively larger. From "a" to "d", the mean values decrease in turn. There is no significant difference between stages with the same letter. There is a significant difference between stages with different letters, (P < 0.05).

Discussion
In this study, we identified some key genes and pathways regulating spermatogenesis during the reproductive cycle of the testis from different development stages in turbot using histological and RNA-seq analyses. To our knowledge, the present study is the first to investigate spermatogenesis combining male germ cell development with transcriptome in marine fish species. Previous reports related to the process of spermatogenesis in fish mainly focused on morphology descriptions, physiological studies, and the quantity and location of specific genes [11][12][13] . The large-scale identification of functional genes regulating testis development and spermatogenesis is limited, although omics have been used to investigate biological characteristics such as growth, immunity, sexual dimorphism and sex determination [8][9][10] . In the present study, we obtained many DEGs between different typical developmental stages. The differences in gene expression between MSII and MSIII, indicate that dramatic changes occurred from spermatogonium proliferation to spermatocytes meiosis. Another significant difference appeared between the MSIII and MSV stages, which indicates that significant changes occurred in the gene network for germ cell development from spermatocytes meiosis to sperm maturation. At the MSII stage, NANOS3 (sm-NANOS3) was predominantly expressed, with spermatogonium self-renewal and differentiation noted through histological observation. Similar results were obtained in mammals and other fish species 14,15 . For example, in mice, NANOS3 was expressed in most undifferentiated spermatogonia (As to Aal) 14 , which is responsible for maintaining the undifferentiated state of spermatogonia by the control of their cell cycle 16 . In some teleost, such as common carp (Cyprinus carpio) 17 , half-smooth tongue sole (Cynoglossus semilaevis) 15 and zebrafish (Dania rerio) 18 , NANOS3 was also expressed in germline stem cells and maintained the stem characteristics. NANOS3 might play an important role in regulating the balance of the proliferation and differentiation of spermatogonium cells. (d) Visualisation of the candidate hub genes from the black module. Node colour denotes differential expression levels: blue represents down-regulation, red represents upregulation and orange represents up and down-regulation both presented. Node size represents the importance of a node. The edge denotes the interaction strength.
It is well known that RA, which is regulated by the RA synthesis enzymes ALDH and degradative enzymes CYP26, binds to its receptor to regulate downstream gene expression or acts directly on the effect of gene STRA8, regulating the initiation of meiosis during sex differentiation and spermatogenesis in mammals such as mice [19][20][21] and some teleost species, such as zebrafish 22 , Nile tilapia (Oreochromis niloticus) 23 , and southern catfish (Silurus meridionalis) 24 . In turbot, some genes related to RA metabolism and its receptor were significantly up-regulated at the MSIV stage compared with the MSII stage. Meanwhile, by constructing WGCNA gene network for DEGs, CYP26A1, one of the degradative cytochrome P450 enzymes, was identified as a candidate brown module hub gene in the gene network visualisation produced by WGCNA analysis. Therefore, we speculated that RA metabolism plays an important role in the regulation of meiosis in turbot by regulating spermatogonial cells during meiosis, including the differentiation of cells to secondary spermatocytes, processes that are observed from stage II to IV. An RA signal pathway is required for germ cell meiotic initiation during spermatogenesis.
In vertebrates, sperm development and maturation are directly regulated by gonadal steroid hormone secretion. The relationship between the expression of genes encoding steroidogenic proteins and receptors for gonadotropins and testicular steroid production have not yet been comprehensively determined in male teleost 25 . In turbot, sperm maturation occurred at the MSV stage, and steroidogenesis enzymes and steroid hormone receptors such as STAR and LHR both showed obvious high expression levels. Furthermore, members of the ovarian steroidogenesis pathways, such as the ER were also significantly up-regulated in turbot at the MSV stage. Oestrogen (E2) has been proven to play an important role in mammals 26 , insects 27 , and fish. Oestrogen significantly influenced the maturation and differentiation of spermatocytes into spermatids 26,28 and significantly enhanced the sperm fertilising capacity in vitro, inducing the acrosome reaction and sperm capacitation 29 in mammals. For teleost, such as the European eel (Anguilla Anguilla) 13 , githead seabream (Sparus aurata) 30 , and rainbow darter (Etheostoma caeruleum) 31 , ER have been reported to be involved in the early stages of spermatogenic cell development. The significant up-regulation of the ER and DEG enrichment in the oestrogen pathway at the MSV stage indicated the high importance of E2 in sperm maturation and hydration in turbot. Furthermore, ER-β, IGF1, TRIP10 and MAPK4 of endocrine system were identified in the gene network visualisation from black module. They may experience interaction during sperm maturation.
Kinesins have been reported to be involved in acrosomal biogenesis in the rat 32 , Chinese mitten crab (Eriocheir sinensis) 33 , and lobster (Procambarus clarkia) 34 . In the present study, members of the kinesin gene family, such as KIF15, KIF16B, KIF20B and KIF22, showed significant differences in expression patterns. The kinesins family in turbot might participate in the dynamic cytoskeleton rearrangements of male germ cells, including mitotic and meiotic division, essential organelle transport and the biogenesis of peculiar structures, although no acrosome biogenesis in teleost sperm was observed during male germ cell proliferation, sperm maturation and fertilisation.
In addition, in this study, a total of 83 candidate sm-miRNAs, including miR-34, miR-29, miR-202 and miR-let-7, were identified by bioinformatics combining with four screen criteria. Increasing evidence demonstrates that miRNAs play an important role in various biological processes by negatively regulating target mRNA. In mammals, miR-34 is required by developing gametes and embryos and may be a potential biomarker of sperm quality, miR-202 35 maintains spermatogonial stem cells by inhibiting cell cycle regulators and RNA binding proteins and miR-29 and miR-let-7 36 promote SSC differentiation and the meiosis process. Because of their prominent functions, hundreds of miRNAs have been identified in recent years, while only a very small number of miRNAs involved in the spermatogenesis process in teleost have been identified. Therefore, the stage-specific requirements and regulatory mechanism for particular miRNAs in spermatogenesis remain largely uncharacterised for turbot.
In conclusion, transcriptome dynamics during spermatogenesis presented in the present study provide important molecular resources to improve our understanding of the testis development process of turbot. The identification of candidate genes such as NANOS3, RARs, KIFs, steroid hormone synthesis-related genes and receptor genes, related to male germ cells proliferation and maturation of turbot might contribute to find out that their regulatory roles in testis development. In addition, further study of these predicted genes could provide useful insights into turbot male germ cells development.

Methods
Experimental fish, and sample collection. The study described in this article was conducted following the European Union (EU) Directive 2010/63/EU for animal experiments (http://ec.europa.eu/environment/chemicals/labanimals/legislation_en.html) and was approved by the ethical committee of the Institute of Oceanology, Chinese Academy of Sciences.
Turbot were reared indoors in circular tanks supplied with flow-through seawater with an ambient temperature of 18 °C ± 0.5 °C and a dissolved oxygen level of ≥5 mg/ml at Shenghang Sci-Tech Co., Ltd. (Weihai, Shangdong Province, China). A photoperiod management regime in accordance with the 8L: 16D, 12L: 12D and 16L: 8D sequential order was used to control gonad development. Three samples were obtained at the same time once in two weeks.
Turbot samples were measured for total weight (±0.001 g), length (±1 mm), and gonad weight (±0.001 g). The gonadosomatic index was 100 (gonad weight/body weight). Gonads were divided in half with one lobe collected and processed for histology and one lobe collected for molecular studies and flash-frozen in liquid nitrogen. The samples were collected from March 2016 to July 2016 to obtain males undergoing distinct developmental stages of the reproductive cycle (Stage-Stage). Stage II testes were collected from 11-month-old males. Approximately 32 males aged between 11 months old (0.38-0.42 kg) and the breeding stage were selected. Testes were excised and examined by anatomical and histological methods to determine their developmental stages. Light microscope observation and statistical analysis. A testis lobe from each fish was fixed in Bouin's solution for 24 h, dehydrated in a series of ethanol, clarified in xylene and embedded in paraffin. The tissues were embedded in paraffin and sectioned with a microtome (Leica RM 2155) at 3 μm thicknesses. All tissue sections were stained with HE. Images were captured by a Nikon Ni-E microscope equipped with a Nikon DS-Ri2 imaging system. All sections from each male fish were examined in a random fashion. Each cell type (spermatogonia, spermatocyte, spermatid, and spermatozoa) at each section of the border and inside of the cyst was scored and five random images per fish were analysed.

Construction and sequencing of cDNA libraries.
Total RNA for each sample was extracted with integrity, and the size distributions of the RNA samples were verified using an Agilent 2200 Bioanalyer (Agilent Technologies, Germany). Samples with an RNA Integrity Number ≥8.0 were used for cDNA library preparation. The concentration of RNA in each extracted sample was measured. cDNA libraries were constructed for each RNA sample and 18 samples and 18 libraries were sequenced. During the QC step, an Agilent 2100 Bioanalyzer and ABI StepOnePlus Real-Time PCR System were used to quantify and qualify the sample library. Then, the library underwent paired-end sequencing using an Illumina HiSeq X-Ten. After sequencing, we received raw reads and filtered out any reads that had low-quality scores, were adaptor-polluted or had a high content of unknown bases (N) to produce clean reads. The specific parameters for raw reads filtering as follows: remove reads with adaptors; remove reads in which unknown bases(N) comprise >5%; remove low-quality reads (we define a low-quality read as the percentage of base which quality is lesser than 15 is greater than 20% in a read). After filtering, the remaining reads are termed 'clean reads' . Next, a de novo assembly was run with the clean reads to obtain unigenes.
qRT-PCR validation of the DEGs. Six genes (CYP20A1, LHR, KIF15, STAR, BECN1 and MNS1) were chosen for confirmation of DEG data by qRT-PCR. The experiments used SYBR Green (Takara, Dalian, China) detection in a CFX Connect Real-time System (Applied Biosystems, BIORAD, USA). The UBQ and RSP genes were used as reference genes in these experiments 45 . Relative gene expression data were analyzed using the 2 −ΔΔCt method. All reactions were carried out in triplicate.

WGCNA.
A gene co-expression network was constructed using the WGCNA method, which was implemented with the WGCNA package in R 46 .

Reproduction-related genes during spermatogenesis identified from transcriptome sequences.
The identification of homologous unigenes with spermatogenesis-related proteins was carried out by searching the key words of reported proteins in BLASTX alignment results with the databases mentioned above. The identification of key unigenes that regulate male germ cell proliferation and maturation during different development stages was based on the differential expression results.
Bioinformatics prediction of miRNAs from transcriptome sequences. Sequences used for sm-miRNA prediction were derived from a local BLAST search that compared the turbot transcriptome database with the miRNA hairpin database. The procedures for searching for candidate sm-miRNA were performed, as previously reported, with some modification (evalue = 1e-5) 47 .