The transcriptomic signature of different sexes in two protogynous hermaphrodites: Insights into the molecular network underlying sex phenotype in fish

Sex differentiation is a puzzling problem in fish due to the variety of reproductive systems and the flexibility of their sex determination mechanisms. The Sparidae, a teleost family, reflects this remarkable diversity of sexual mechanisms found in fish. Our aim was to capture the transcriptomic signature of different sexes in two protogynous hermaphrodite sparids, the common pandora Pagellus erythrinus and the red porgy Pagrus pagrus in order to shed light on the molecular network contributing to either the female or the male phenotype in these organisms. Through RNA sequencing, we investigated sex-specific differences in gene expression in both species’ brains and gonads. The analysis revealed common male and female specific genes/pathways between these protogynous fish. Whereas limited sex differences found in the brain indicate a sexually plastic tissue, in contrast, the great amount of sex-biased genes observed in gonads reflects the functional divergence of the transformed tissue to either its male or female character. Α common “crew” of well-known molecular players is acting to preserve either sex identity of the gonad in these fish. Lastly, this study lays the ground for a deeper understanding of the complex process of sex differentiation in two species with an evolutionary significant reproductive system.


Results
Pre-processing treatment of reads and assembly construction. Illumina HiSeq. 2500 sequencing of red porgy and common pandora samples yielded 685,707,304 (342,853,652 read pairs) and 644,189,592 (322,094,796 read pairs) reads, respectively. The raw sequence data have been submitted to the National Centre for Biotechnology Information (NCBI) Sequence Read Archive (SRA) database (BioProject ID: PRJNA395994).
For red porgy, pre-processing of reads resulted in 247,712,854 high-quality reads used for the initial transcriptome assembly, which consisted of 295,367 putative transcripts (N50: 1,961 nucleotides, mean length: 928 nucleotides) and showed significant similarity to 23,122 out of 26,754 European sea bass genes. After transcript quality assessment, our initial dataset was limited to 98,012 contigs (N50: 2,978; mean length: 1,587 nucleotides) corresponding to 67,132 genes and was significantly similar to 20,130 European sea bass genes. Filtering resulted in elimination of 197,355 possibly spurious transcripts and this restricted dataset constituted the final transcriptome with only slightly reduced numbers of hits to the unique genes of European sea bass.
For common pandora, filtering of reads resulted in 232,768,947 high-quality reads that were used to construct an initial assembly of 396,201 putative transcripts (N50: 1,529 nucleotides, mean length: 824 nucleotides). Additional editing excluded 254,892 possibly spurious transcripts, thus limiting our dataset to 141,309 contigs (N50: 2,450; mean length: 1,254 nucleotides), corresponding to 98,211 genes. On this occasion, the unfiltered transcript dataset showed significant similarity to 23,631 European sea bass genes out of 26,754, while the filtered assembly showed significant similarity to 21,028 genes. Following the same reasoning, the restricted dataset constituted the final transcriptome (Fig. 1).
To evaluate the accuracy of the assembled sequences (transcripts), all the usable sequencing reads were aligned onto the transcripts using Bowtie2 51 . A high percentage of red porgy and common pandora reads (84% and 81%, respectively) were successfully back-mapped on each assembly. Similarities between assemblies and UniProtKB/ SwissProt database disclosed 21,217 and 21,251 contigs for red porgy and common pandora, respectively, with coverage higher than 80%. In addition, BUSCO run (Supplementary were nearly complete with 86% and 82% complete matches of vertebrate orthologs, found in red porgy and common pandora assemblies, respectively. Transcriptome annotation. The blastx search in red porgy and common pandora assembled transcriptomes (e -value cut-off <10 −5 ) returned 43,123 and 50,106 contigs, with at least one blast hit against Uniprot 52 and 38,354 and 44,392 contigs with a protein domain match against InterProScan 53 search, respectively (Supplementary Table ST2). To identify the possible functions, Gene Ontology (GO) assignments were used to classify the sequences. Total numbers of 38,265 red porgy contigs and 51,026 common pandora contigs with at least one GO term were found (Blast2Go 54 annotated) (Supplementary Table ST2). The sequences from both species were categorized to thirteen molecular function (MF), 22 biological process (BP) and thirteen cellular component (CC) gene categories in GO level 2 (general function categories) (Supplementary Figures SF1-4). To assess the functional diversity of the assembled transcriptome, GO annotations of red porgy were compared with those of common pandora transcriptome, reflecting a similar functional distribution on GO categories and similar sequence diversity of the two transcriptomes. Profiling of both species (whole transcriptome) was generally very much alike, as shown in the comparative view produced by WEGO 55 (Fig. 2).
The potential enzymes were characterized based on the chemical reaction they catalyze, using the prediction of Enzyme Commission (EC) numbers for each sequence. For red porgy, among the 38,265 blast2go-annotated contigs, 9,572 contigs were classified to 1,194 different EC numbers, while among the 51,026 common pandora blast2go-annotated contigs, 12,821 were classified to 1,243 different EC numbers (Supplementary Table ST2).

Global expression pattern of brains and gonads. The principal component analysis (PCA) plots
obtained for the global gene expression pattern observed in brain and gonad tissues of the nine red porgy and ten common pandora individuals showed the formation of three separate clusters for both species: the brains, the male gonads (testes) and the female gonads (ovaries) (Fig. 3). While testes and ovaries showed a clear separation with a remarkable variation within the two groups, male or female brains from both species were clustered together with very similar expression patterns.
Tissue-specific gene ontology profiling for both fish revealed differences in GO terms representation between brains and gonads, in all three categories in GO level 2. When comparing brains and gonads, except from altered percentages of the same main ontology categories (shared between brains and gonads), there were terms such as synapse (CC), cell junction (CC), behavior (BP), rhythmic process (BP), cell aggregation (BP), cell killing (BP), chemoattractant activity (MF) and chemorepellent activity (MF) that were present only in the brains of the two sparids, but were totally absent or present in low numbers in their gonads. This would indicate brain specificity of the underlying genes ( Supplementary Figures SF1-4).
For a better understanding of the biological processes and molecular functions underlying the nature of a male or female phenotype, GO term enrichment analysis was carried out in order to detect differentially-expressed (DE) genes over-represented in the different sexes of both species. Here, the term 'enrichment' stands for the comparison between the DE genes of the female (test-set) against the male (reference-set) either for brains or gonads. The same picture of sample clustering in both species was also shown through heatmap plots (Fig. 4). Comparisons between males and females were conducted separately for the brain and gonad samples to produce PCA plots and heatmaps (Supplementary Figures SF5 and SF6). This closer view again showed that sex-dependent expression differences are greater by far in gonads than in brains. In the brain PCA plots of both species, males Figure 1. The pipeline followed to build assemblies and differential expression profiles in brains and gonads of two protogynous sparids. Flow chart of the basic steps implemented from raw reads for the selection of final assembled loci and differentially expressed genes in brains and gonads of both species. and females clustered together, with red porgy brain samples clustered more tightly than those of the common pandora that show a looser clustering.
Comparing gene expression in male and female brains. Out of 67,132 red porgy and 98,211 common pandora total loci, 43,330 and 61,048 genes had at least 15 reads in all compared samples (the sum of counts in Figure 2. Comparative view of GO annotation. Comparison of Gene Ontology. (GO) categories (assigned by WEGO) between red porgy and common pandora. GO terms for the genes of red porgy (dark purple) and common pandora (pink) were categorized into cellular components, molecular functions and biological processes. each row of the count matrix >15) and were used for differential expression analysis. In general, brain DE genes showed low counts and low fold-change differences (Supplementary Figure SF7). For red porgy brain samples, 553 genes differed significantly with 365 of them having a male-biased (over-expression) and 188 of them having female-biased expression. Common pandora brain samples showed higher numbers of DE genes and an opposite direction of expression than red porgy brain samples. In total, 4,910 genes showed differential expression between the sexes, of which 1,733 were up-regulated in the male brains and 3,177 of them were up-regulated in the female brains (Fig. 5).
A GO analysis was conducted for all three general function categories (BP, MF, CC). Focusing on biological processes, brain enrichment analysis resulted in 241 GOs over-represented in female (light red-colored in Supplementary Excel Table SET1) and 265 in male common pandora (light blue-colored in Supplementary Excel Table SET1). For red porgy, 58 GOs were over-represented in females (light red-colored in Supplementary Excel Table SET2) and 113 in males (light blue-colored in Supplementary Excel Table SET2).
Comparing gene expression in male and female gonads. In gonad samples, we applied the same filter and kept only loci with a sum of expected counts in each row of the count matrix >15. Specifically, 43,692  and 52,126 genes fulfilled this criterion and were examined for differential expression in red porgy and common pandora gonads, respectively. The analysis revealed striking differentiation between testes and ovaries. In total, 18,724 red porgy gonadal genes showed a significant differential expression, of which 15,989 were up-regulated in the testes and 9,951 of them were up-regulated in the ovaries. In the case of common pandora, 25,928 gonadal genes had significantly differential expression, 16,148 them with up-regulation in their testes and 9,780 them with up-regulation in their ovaries (Fig. 5).
Focusing again on biological processes, Enrichment Analysis of DE genes in common pandora gonad tissues resulted in 236 GO terms significantly enriched in testes (light blue-colored in Supplementary Table SET3) while 517 GO terms were found in ovaries (light red-colored in Supplementary Table SET3); likewise, in red porgy 284 GO terms were significantly enriched in testes (light blue-colored in Supplementary Table SET4) and ovaries enriched with 370 GO terms (light red-colored in Supplementary Table SET4).

Sex-specific candidate genes in gonads.
To have a closer view in the molecular playground causing sex dimorphism in these sparids, we checked the presence and the expression patterns in their gonads, of genes that have been reported as having an important role or participating in sex determination/differentiation or in sex maintenance in other fish or vertebrates. We selected 114 candidate genes in total. Out of this list, 84 genes were present in the transcriptomes of both species, while Srd5a2 was present only in red porgy transcriptome and Dmrt3a was found only in common pandora. Table 1 includes all the genes with significantly differential expression in at least one of the studied species (excluding the not significantly different -NSD-characterized-genes in both species) and compares their expression pattern between them.
Targeted search of candidates genes for sex determination/differentiation. To identify genes that play a role in reproduction and more specifically in sex determination and differentiation of fish, we searched the annotated brain and gonad datasets for these GO categories and checked if they were differentially expressed. In particular, DE genes related to broad terms of biological processes, such as "reproduction" and "reproductive process" were searched in gonads and brains of both sparids. A large number of genes related to "reproduction" and a few to "reproductive process" were found in the gonads of both species. In the common pandora, 492 female-biased and 717 male-biased genes were found under the "reproduction" GO term, while thirteen female-biased and eleven male-biased genes were found under the "reproductive process" GO term. In the red porgy, 467 female-biased and 430 male-biased genes belonging to "reproduction" and nine female-biased and twelve male-biased genes belonging to "reproductive process" were detected. The same search in brains retrieved 246 "reproduction" related genes expressed in the common pandora, of which 153 were male-biased and 93 female-biased. Only four male-biased "reproductive process" related genes were found to be significantly expressed in common pandora brains. In the red porgy, 22 male-biased and just three female-biased "reproduction" related genes were found and no "reproductive process" related genes were among the DE genes. Further data mining, focusing on more specific GO terms, revealed nine loci in common pandora and five loci in red porgy containing the GO term "sex determination" and displayed significant differential expression in male and female gonads. A search for loci implicated in "sex differentiation" with differential expression in male and female gonads retrieved fourteen loci for common pandora, while 22 were found for red porgy. In brains, only one "sex determination" related and two "sex differentiation" related loci were found to be differentially expressed in the common pandora, while no differentially expressed loci for these GO terms were found in red porgy (Supplementary Excel  Tables SET5-7). Finally, several genes expressed in male brains characterized with the GO term "social behavior" enriched in the red porgy and "inter-male aggressive behavior" and "male courtship behavior" in common pandora (Supplementary Excel Table SET8).

Discussion
The present study investigated sex-specific gene-expression patterns in gonads and brains of two protogynous hermaphrodites belonging to the Sparidae family, the common pandora and the red porgy. This is the first comparative transcriptomic analysis of two species displaying the same reproductive mode, focusing on genes that are differentially expressed between sexes to pinpoint a common network that forms and/or maintains the male and female phenotype. In both species, male and female brains showed far fewer differences in expression compared to the gonads. Male and female brains were clustered together, forming a dense cluster. Quite the opposite was observed for the gonads, which differ greatly between the two sexes. Testes and ovaries were well separated and there was great variance between the formed groups (Figs 3, 4 and Supplementary Figure SF5). Minor sex differences indicated a more homogeneous and sexually plastic brain. Our study revealed subtle differences in expression between the sexes in the brain of the two protogynous species. This seems to be a common pattern in tissue-based studies, where sex-biased expression tended to be highest in the gonads compared to other tissues 56 . A low degree of differential expression in the brain has also been reported in other fish studies 35,39,57 . Nevertheless, in the current study we examined the brain as a whole, whereas it has been recognized that sex-specific differences in fish are noticeable only when comparing certain regions of the brain 58 . Despite the overall limited differences observed for brains compared to the gonads, common pandora male and female brains demonstrated a greater divergence than those of the red porgy, with many more genes differentially expressed between the sexes. Moreover, the direction of expression is female-biased in the common pandora, in contrast with the male-biased pattern observed here in the red porgy, but also in the sparid sharpsnout seabream 35 and other fish examined 38,57 . However, these results were in line with the expression pattern observed in protogynous clownfish brain 39 .
A closer look into the specific genes that were up-regulated in male or female brains revealed numerous candidates with a well-known role in sex differentiation in other species. Zona pellucida (ZP) genes that showed a sex-biased expression in both brains and gonads were over-expressed in female brains. Zona pellucida proteins are glycoproteins of the fish chorion that are encoded by multiple gene families. In humans and mice, but also in some teleosts [59][60][61] , they are expressed in the ovary. In addition, several factors of the Sox gene family, as well as several members of the Forkhead box transcription factor (Fox) family were detected, which might play important roles in the brain of these protogynous fish. The comparative analysis revealed commonly over-expressed genes in the male brains of both species, pinpointing candidates that might make a significant contribution to the expression of the male character of the brain in protogynous sparids. Among these, the neurobeachin (Nbea) is required in synaptic function, regulating neurotransmitter receptor trafficking to synapses 62 . A male over-expression shared between species also displayed the PRKCE (Protein kinase C epsilon type) and EphA4 (Ephrin type-A receptor 4). The first is implicated in platelet activation and positive regulation of the mitogen-activated protein kinase (MAPK) cascade, while the second is a receptor tyrosine kinase implicated in the ephrin receptor signaling pathway and positive regulation of Jun kinase activity. The role of such kinases in mammalian sex determination has been reported 63 , while mice studies underline the role of the phosphorylated form of MAPK (pMAPK) as a marker of neuronal activity in a reproductive context 64 .
In the common pandora and red porgy brains we also found the early growth response factor 1 (Egr1), the early growth response factor 4 (Egr4), as well as Fos (Proto-oncogene c-Fos) and all had a male-biased expression profile. The Fos gene, which encodes a nuclear phosphoprotein, reflects immediate neural activity, whereas Egr1 is believed to regulate later-acting genes, such as gonadotropin-releasing hormone 1 (GnRH1), a primary signal essential for reproduction 65 . The Egr1 and c-Fos, both characterized as immediate-early genes, are related to changes in brain activity and genomic responses to social cues, behavior and mating information 66 .
Overall, in the brain of both protogynous species there were some sex-specific expression differences, although it is not known whether these differences mediate sexually dimorphic events. The more "homogeneous" transcriptome profile of the brain, initially perceived by the lack of clustering in male and female brain, became more apparent in GO term enrichment analysis, where GO terms related to female development and differentiation were over-represented in male brains and vice-versa. Sex differences in the brain, even in the human brain, are still poorly defined. As such, the issue of "brain sex differentiation" remains a controversial subject of research, especially in fish. Furthermore, the well-established phenomenon in mammals of fetal or neonatal brain exposure to sex hormones producing irreversible differences of brain structure and function, appears not to be the case in fish. Instead, fish brains are not irreversibly sexualized, conserving properties of the embryonic mammalian brain, such as high neurogenic activity, throughout life 67 . The notion that fish exhibit much more plasticity with regard to brain sex differentiation has been strengthened by the findings of the present study.
Major sex differences in gonadal gene expression patterns. The gonads is the tissue that differs the most between sexes 68 . Despite the lability of gonadal tissue in sex-changers and the fact that testes and ovaries share a common precursor, they clearly represent two distinct types of tissue upon the establishment of sex. The functional divergence of the two gonadal types is reflected in their transcriptomic profiles, in terms of the number of genes differentially expressed, as well as the expression magnitude (i.e. fold-change differences). The observation of almost double the number of up-regulated genes in males compared to females indicates a male-biased expression tendency. More and more transcriptomic studies dealing with sex-specific expression differences in teleosts have pointed out male-biased over-expression tendencies 16,35,38,39,57,69,70 . In zebrafish, a study exclusively addressed the "male sex drive" hypothesis, originally proposed by Singh and Kulathinal 71 , suggesting that malebiased evolutionary pressures might have resulted in a "transcriptome masculinization" of the species. The present study, with the two protogynous hermaphrodites, along with the previous one on rudimentary sharpsnout seabream 35 , agree with a skewed ratio of male up-regulated versus female up-regulated transcripts in gonads, in

Gene name
Gene description

Gene ID Expression
Common pandora Red porgy favor of the first. In cases such as the protogynous hermaphrodites studied here, with no signs of heterogametic chromosomes, such a tendency may explain a great deal about the formation and maintenance of the male and female phenotypes.

Expression profile of important molecular players in protogynous hermaphrodite gonads.
Studies on vertebrates and other fish species have brought to light several candidate genes with proven or suspected relationship in sex determination and differentiation. As there is scanty literature-based information regarding the sex-related genetic cascade in these protogynous species, we compiled a list of well-studied sex-biased genes and searched for their expression patterns in the gonadal transcriptomes of both sparids. Despite the plethora of different sexual systems in fish, there is some evidence of conservation of the genetic cascade, with its "basic" components (molecular players) found repeatedly, not only among fishes but also generally in vertebrates 72,73 . However, the specific role and place for each component of this at least partially conserved "crew" is not prominent and might change in different species and/or sexual systems 57,74,75 . Based on the literature, we profiled the expression of known candidate (Fig. 6) molecular-players/genes establishing the common female (Cyp19a1, Sox3, Foxl2, Figla, Wnt4a, Gdf9, Fst, Lhb, Ara, Arb, etc.) and male identity (Dmrt1, Sox9, Dnmt3aa, Rarb, Raraa, Hdac8, etc.) of the gonads of these protogynous sparids. Additionally, we focused on those contributing in species-specific manner either to female (Gata-4, Sox10, Amhr2, Gsdf, Srd5a3 etc.) or to male (Amh, Dmrt3a, Srd5a1, Srd5a2, etc.) characters. Starting with the formation and maintenance of the female phenotype, some genes over-expressed, showed high Log 2 fold changes in the ovaries of both sparids. The Figla gene that appears to have a decisive role in the female phenotype of both species is such an example. This transcription factor up-regulates ZP genes, which code for products used to build the vitelline envelope surrounding eggs in teleosts (zona matrix) [76][77][78] . Moreover, it has been suggested that the Figla factor displays an anti-parallel direction of expression to Dmrt1 during the transition from male to female in a protandrous 79 and vice versa in a protogynous teleosts 80 . The same expression pattern was showed for Foxl2, a prominent marker of ovarian development across vertebrates 81 . The Foxl2 has been reported as a female-biased gene in several gonochoristic [82][83][84][85] and, recently, in some hermaphroditic species 38,39 .
The gene of ovarian aromatase is probably the one with the most protagonistic role in gonadal fate. Together with estrogens it has been hypothesized to control not only the ovarian, but also the testicular differentiation. When it is over-expressed it appears to trigger and maintain ovarian differentiation and once its expression reduces it appears to allow for testicular differentiation to take place 86 . Here, a strong female-biased expression of gonadal aromatase gene (Cyp19a1a) was observed in both species. In fish, Cyp19a1a has an essential role in ovarian differentiation. Its expression was not only reported in gonochorists 75,[87][88][89] , but also in other hermaphrodite fish 39,90-94 which repeatedly highlights its involvement with the gonadal sex differentiation and sex change. It has been also suggested that the expression of this conserved and well-characterized protein is controlled not only by a "classical" transcriptional activator such as Foxl2, but also by epigenetic modifications 95 . Gonadal aromatase gene and its regulation through methylation modifications has also been involved in thermo-sensitive sex differentiation systems in fish 92,96 . Thus, changes in methylation levels of the Cyp19a1a promoter might be an additional control factor in sex-changing species 95 . The precise role of aromatase in the sex determination/ differentiation systems of these species awaits further and deeper studies. The Cyp19a1a promoter contains binding sites for Wilms tumor 1 protein (Wt1) 86 . In mammals, there is only one Wt1 gene critical for development of the urogenital system, whilst in teleosts two Wt1 (Wt1a, Wt1b) genes have been identified and both are over-expressed in ovaries 97 .
The Luteinizing hormone (Lh) and its receptor (Lhr) showed a strong female over-expression in both species examined here. According to studies in medaka, both genes are involved in fish ovulation 98 . A common female-biased expression also displayed the fushi tarazu factor-1 (Ftz-F1) or steroidogenic factor-1 (Sf-1). Interestingly, the opposite has been described in two other hermaphrodites 38,39 and in some well-known gonochorists 99,100 , where Sf-1 gene has been reported as a male-biased gene. It has been hypothesized that Sf-1/Ftz-F1 along with some additional partners, such as Foxl2 and cAMP, co-regulate the transcription of aromatase, as the later contains binding sites for Sf-1 genes 86 . Two members of the Wnt/b-catenin pathway, Wnt4 (wingless-type MMTV integration site family, member 4) and Ctnnb1 (Catenin, beta 1) and Fst (follistatin) well recognized markers of ovarian differentiation linked to this pathway, all seem to contribute to the female identity of the gonads in both sparids studied here.
Three factors of the retinoic acid (RA) signaling pathway were found in the transcriptome of both species and of those three factors Cyp26a1 over-expressed in their ovaries of both sparids. The role of this pathway in sex determination/differentiation and particularly in ovarian differentiation [101][102][103] is further supported by respective findings among ovary-enriched GO terms. Recently, a RA homeostatic mechanism stimulated by retinoic acid gene 8 (Stra8) via an independent pathway has been shown to implicate that Cyp26a1 and Aldh1a2 mediate the meiotic entry of germ cells in Nile tilapia 102 . While Stra8 was absent from the transcriptomes of the studied sparids, Stra6 was present and over-expressed in the ovaries of both species. A search for genes encoding RA receptors (Raraa, Rarab and Rarb) revealed that all three exhibited sex-biased expression profiles. The first two were commonly male-biased, while Rarb was over-expressed in red porgy ovaries.
Four members of the TGF-b superfamily (Gdf9, Gsdf, Amh, Amhr2) were found in both species' transcriptomes and displayed sex-biased expression. In particular, growth differentiation factor-9 (Gdf9), which is required for ovarian folliculogenesis 104 was up-regulated in the ovaries of both species. Our findings are in line with those of another protogynous fish, the bluehead wrasse 38 . The gonadal soma derived factor (Gsdf), a gene probably unique to teleosts and recently identified as the male-specific master sex-determining gene in Oryzias luzonensis 10 was uniquely over-expressed in the ovaries of red porgy, possibly indicating a different role for the Gsdf gene in this protogynous species. A well-known member of the TGF-b family, the anti-Mullerian hormone (Amh) seems to contribute to the common pandora male phenotype, while it showed no sex-biased differences in red porgy. Interestingly, its receptor, Amhr2, is female-biased in the later species. The Amh and Amhr2 are known master SD genes in the Patagonian pejerrey Odontesthes hatcheri 11 and in the tiger pufferfish Takifugu rubripes 12 , respectively. The Amh role in teleost sexual differentiation, which lack Müllerian ducts, is not clear, though it has been proposed that proliferation and differentiation of spermatogonia are the possible common functions of this gene among vertebrates 105,106 . The same gene has also been found among the male-specific genes in some gonochoristic 57,107 and hermaphroditic 35,39 fish and it is involved in the sex change of the protandrous black porgy 90 .
Two major components of the Sox genes 108 (Sox3 and Sox9) were found among the commonly expressed gonadal genes between the two protogynous species. The present study demonstrated that Sox3 has a high female-specific expression in both common pandora and red porgy. In mammals, Sox3 is implicated in the induction of testis formation 109 and spermatogenesis 110 . However, its role in fish sex determination and differentiation is still ambiguous as in some studies it is linked to the male phenotype 14,111 , while in others to the female 38,57,112 . The other Sox gene, Sox9, is a male-biased gene in both species. The direction of expression of Sox9 agrees with the male-specific role of this gene in mammals 113 . Nevertheless, Sox9 has been reported to have a male-biased expression in rudimentary sharpsnout seabream 35 , the protogynous bluehead wrasse 38 and the protandrous clownfish 39 . On the contrary, in the medaka Sox9 is expressed in ovaries 114 . Interestingly, in zebrafish Sox9 has two homologues, one expressed in oocytes and the other in testicular Sertoli cells 115 . Taken together, the results suggest that Sox3 and Sox9 are important in the common pandora and red porgy female and male phenotype formation respectively and suggest both genes as handy molecular players that could contribute to either sex side, in teleosts.
Despite major differences in the genetic control of sexual development in different organisms, Dmrt genes are actively involved in sex-specific differentiation in vertebrates, invertebrates and all teleosts that have been studied so far [116][117][118] . A family member, Dmrt2/Dmrt2a which is linked to neurogenesis 119 and left-right patterning 120 during embryonic development, displayed a female-biased expression shared between species. This agrees with the patterns observed in the Atlantic cod 121 and four cichlids species 57 . On the other hand, the most studied member of this gene family, Dmrt1, was highly up-regulated in the testes of both protogynous species studied. Hence, the results of this study indicate that Dmrt1 together with Sox9 could play a fundamental role in expressing and preserving the male character in gonad. The role of Dmrt1 in sex determination and spermatogenesis in vertebrates and in testicular development and maintenance in teleosts is well established 117 . In line with the current study results, characteristic male-biased expression of Dmrt1 has been reported in numerous gonochoristic 99,[122][123][124] and other hermaphrodite teleosts 33,35,38,39,91 . Dmrt3/Dmrt3a, another male-specific gene in general 125 , appears to contribute to male phenotype of common pandora, as it was strongly expressed in their testes, but it was absent from the red porgy transcriptome.
Another interesting finding from our study was the expression profiles of androgen and estrogen receptor genes (Ara, Arb, Er/Esr1 and Esr2b), as well as estrogen-related receptors (Esrra and Esrrb) in the gonadal transcriptomes of both species. Surprisingly, both androgen receptors were up-regulated in the ovaries of both species and Esrra displayed a common male up-regulation. Although it is still unclear how the endocrine mechanism works to control sex change, steroid hormones are believed to regulate the process 126,127 .
Finally, genes of epigenetic maintainers, such as the DNA-modifying enzymes DNA methyltransferases (Dnmt3aa, Dnmt1, Dnmt3ab) and histone tail-modifying and de-modifying enzymes, such as histone acetyl transferases (Ep300a, Kat2b) or histone deacetylases (Hdac8, Hdac2, Hdac10, Hdac11) were found in transcriptome of both protogynous sparids and showed a sex-biased expression. Further, enrichment analysis revealed genes involved in epigenetic related mechanisms, supporting their implication in sex determination/differentiation and apparently sex maintenance. The relationship of the epigenetic mechanisms to sex determination and differentiation in vertebrates has been clearly stated and these mechanisms are considered to be the link between environmentally controlled gene expression and sex reversal 95 .

Conclusions
In this study, we sequenced the total mRNA from the gonads and brains of both sexes in two protogynous hermaphrodite sparids, the common pandora and the red porgy. This approach permitted us to obtain a global view of sex-specific expression patterns, as well as it allowed us to compare sex-biased gene expression patterns within teleosts having the same reproductive mode. Focusing on the pathways and genes implicated in sex determination/differentiation, we tried to unveil the molecular pathway by which a protogynous hermaphrodite fish develops a masculine or a feminine character. We found minor sex-related expression differences indicating a more homogeneous and sexually plastic brain, whereas there was a plethora of sex biased gene expression in the gonads. This indicates functional divergence of the transformed tissue either to male or female character, even though both types of gonadal tissue originated from the same precursor. We observed the implicated pathways behind sex-biased expression and saw the recruitment of known sex-related genes either to male or female type of gonads in these fish. Taken together, most of the studied genes form part of the cascade of sex determination, differentiation and reproduction across teleosts. In this study, we focused on genes that are active when sex is maintained (sex-maintainers). Comparing related species with the same reproductive style, we saw different combinations of genes with conserved sex-linked roles and some handy molecular players, in a partially conserved network formulating the male and female phenotype. Finally, this analysis lays the ground for understanding the complex process of sex differentiation in these protogynous sparids and makes future comparative analyses of fish with alternative reproductive modes feasible. Future work in gene expression data during sex-reversal will uncover which genes are triggered to block female path and start to transform the gonadal tissue in the opposite direction.

Methods
Sample collection. Animal care was carried out according to the "Guidelines for the treatment of animals in behavioural research and teaching" 128 . The fish used in the study were maintained in registered and approved facilities for the maintenance and carrying of animal experiments and were reared and sampled according to the guidelines of the Directive 2010/63/EU for the protection of animals used for experimental and other scientific purposes (Official Journal L276/33) 129 and the guidelines of Metcalfe and Craig (2011) 130 . In addition, the experimental sampling protocol was approved by the IMBBC's aquaculture department committee and all methods were carried out in accordance with relevant guidelines and regulations approved by the Ministry of Rural Development and Food and the Regional Directorate of Veterinary Medicine for certified experimental installations (EL 91-BIO-04) and experimental animal breeding (AQUALABS, EL 91-BIO-03). Members of the laboratory include certified technicians by the Federation for Laboratory Animal Science Associations (FELASA).
The fish used for the experiments were either wild fish caught alive, or broodstock fish of wild origin in the Institute of Marine Biology, Biotechnology and Aquaculture (HCMR) (Supplementary Excel Tables SET9). In both cases, mature fish were euthanized using the commercially available slaughter method for marine fish (i.e. immersion in ice-slurry) and the brain and gonads were dissected immediately. Only one of the two gonadal lobes was kept and stored until further processing. The fish were examined macroscopically for sexual maturation, based on the presence of releasable sperm for males or the presence of vitellogenic oocytes for females and/or visual inspection of the gonad morphology. For red porgy, nine brain and nine gonad tissues were sampled from five female and four male individuals, while for common pandora ten brain and ten gonad tissues were sampled from five females and five males. In addition, for each species a pool of several hundred larvae at the "hatching" stage (1 days post hatch) were sampled to increase the possibility of capturing the majority of the species transcripts. All tissues and larvae were collected in a sterile and RNase-free way and stored immediately in RNAlater (Applied Biosystems, Foster City, CA, USA). Tissues for RNA extraction were stored at 4 °C overnight according to the manufacturer's recommendations and then transferred to −80 °C until further processing.
RNA extraction and sequencing. To reduce biases regarding different expression in different parts of the same organ, we homogenized the whole brain, one lobe of testis and one lobe of ovaries. All tissue samples, including larvae, were ground with liquid nitrogen using pestle and mortar, homogenized in TRIzol ® reagent (Invitrogen) and total RNA was extracted from the TRIzol ® homogenate according to the manufacturer's instructions. The quantity of the isolated RNA was measured spectrophotometrically with NanoDrop ® ND-1000 (Thermo Scientific), while its quality was tested on an agarose gel (electrophoresis in 1.5% w/v) and further on an Agilent Technologies 2100 Bioanalyzer (Agilent Technologies). The majority of the samples had an RNA Integrity Number (RIN) value higher than 8. This was not the case for one testes from common pandora and it was excluded from the further analysis. In contrast to their high quality agarose image, all ovaries showed lower RIN values. The low RIN values obtained from ovary samples could be due to the abundance of low molecular weight 5 S RNA expression. In a previous study of sharpsnout sea bream 35  carried out with a commercial kit and the female gonads were in an early maturing stage, the 5 S rRNA was in such abundance that it masked the 18 S and 28 S rRNA peaks thus hampering the correct estimation and calculation of RNA integrity numbers (RIN). This low molecular weight band of 120 bp, belonging to the smallest rRNA molecule, was also detected in the ovaries and intersex gonads of thicklip grey mullet Chelon labrosus, where the role of 5 S rRNA in fish sex differentiation and its potentiality as a sex marker in fish gonads has been thoroughly described 131 . This type of RNA profile was also observed in female gonads of bluehead wrasse 38 and very recently a study described the use of ribosome RNA profile not only as a sex identifier but also as an indicator of ovarian developmental stage 132 . Finally together with two larval samples, 39 samples were used for mRNA paired-end library construction with the Illumina TruSeq TM RNA Sample Preparation Kits v2 following the manufacturer's protocol (Poly-A mRNA isolation with oligo-dT beads, mRNA fragmentation, followed by transcription into first-strand cDNA using reverse transcriptase and random hexamer primers) and sequenced as 100 bp paired reads in three lanes (1,5 lanes per species) of a HiSeq. 2500 TM following the protocols of Illumina Inc. (San Diego, CA).
Read pre-processing. All computations were performed at the high-performance computing bioinformatics platform of HCMR. Read quality was assessed with FastQC 133 and subjected to quality control following a pipeline using Scythe 134 , Sickle 135 , Trimmomatic 136 and Prinseq (prinseq-lite-0.20.3) 137 . We first used Scythe -A Bayesian adapter trimmer (version 0.994 BETA) 134 , which uses a Naive Bayesian approach to classify contaminant substrings in sequence reads; it considers quality information which can make it robust in picking out 3′-end adapters (which often include poor-quality bases) and we therefore ran it before any quality-based trimming (with the prior contamination rate set in 0.1 '-p 0.1'). Low-quality (parameters 'pe -g -t sanger -q 20 -l 45') nucleotide reads trimming was implemented with Sickle 135 , a sliding-window, adaptive, quality-based trimming tool for FastQ files. Using Trimmomatic 136 , some extra filtering steps and also both 5′ and 3′ adaptor removal were performed (parameters 'PE -phred33 ILLUMINACLIP:adapter_file.fa:2:30:10 LEADING:3 TRAILING:3 SLIDINGWINDOW:4:25 MINLEN:45 CROP:99'). Finally, low-complexity sequences (threshold entropy value of 30) and poly A/T 5′ tail (minimum of 5 A/T) trimming was performed using PrinSeq 137 . Lastly, the reads of each pair of all the libraries were combined using custom Perl scripts.
Transcriptome assembly construction. The de novo assemblies for red porgy and common pandora were performed using Trinity package 138 (version 2.0.6) with default parameters (default kmer 25, min length 200 nucleotides). To increase transcriptome coverage, de novo assemblies were carried out on a pool of the pre-processed reads from all samples. To assess the assembled transcripts and exclude the spurious ones, we pooled all filtered reads and mapped them to the selected assemblies (Pagrus_Trinity_assembly.fasta and Pagellus_Trinity_assembly.fasta) with Bowtie2 51 within RSEM (version 1.2.19) 139 , using the script available in trinity utilities align_and_estimate_abundance.pl. Putative transcripts with less than 1% of a locus reads mapped to that particular isoform (IsoPct <1) were excluded (as proposed in) 140 . The same was done for those with Fragments Per Kilobase of transcript per Million mapped read (FPKM) values less than 0.3. The choice of FPKM threshold was based on BLASTn similarity searches (e -value threshold 10 −10 ) against European sea bass cDNA sequences (http://seabass.mpipz.mpg.de). We first contacted a blast search of the unfiltered assemblies against the European sea bass cDNA sequences and then repeated the blast and tested three differently filtered assemblies (a) FPKM_cutoff = 0.3, isopct_cutoff = 1.0, (b) FPKM_cutoff = 0.5, isopct_cutoff = 1.0 and (c) FPKM_cutoff = 1, isopct_cutoff = 1.0. To assess the assembled transcripts quality, we followed three methods: (a) pooling all the reads and mapping them to the assembly using Bowtie2 51 and Samtools 141 to estimate the percentage of mapped reads, (b) examining the similarity between the Trinity assembly and swissprot database using blastx search and (c) quantifying the completeness of the transcriptome assembly and determining how many of the 3,023 genes for vertebrates (OrthoDBv8 was used from BUSCO) were present in our two species transcriptomes, using BUSCO (Benchmarking Universal Single-Copy Orthologs) software 142 that searches for highly conserved, near-universal, single copy orthologs. All BLAST 143 searches were performed in parallel using NOBLAST program 144 . Transcriptomes functional annotation. For the annotation of the assembled transcripts, we conducted a BLASTx similarity search against UniProtKB/SwissProt database 52 (e-value threshold 10 −5 ) keeping the top 20 hits. BLASTx was done in parallel using NOBLAST 144 . For gene ontology (GO) mapping, Blast2GO 54 was used locally to recover all the GO terms associated to the hits obtained by the blast search. After the mapping step, results were subjected to GO annotation, a process of selecting GO terms and assigning them to the query sequences for describing biological processes, molecular functions and cellular components. Gene ontology graphs were generated in "level 2" for GO plot visualization and pie charts for the whole transcriptome profiling. Further, brains and gonads of each species were analyzed separately, using in-house R scripts. The Blast2GO output file was input into the BGI WEGO program 55 and GO annotations were plotted. The sequences were additionally annotated using InterPro 145 ; a local InterProScan (version 5, InterPro release 48.0) 53 was run in parallel (splitting the query set) on the longest open reading frame (ORF) of the contigs using a custom Perl script based on the EMBOSS program "getorf " 146 . GO terms corresponding to these InterPro domains, were merged with the already existing GO terms derived from the blastx against UniProt. The GO terms were modulated using the annotation augmentation tool ANNEX 147 followed by GOSlim 148 . Enzyme classification (EC) codes were obtained through the direct mapping of GO terms to the corresponding enzyme codes. Sequences having EC numbers were further characterized by Kyoto Encyclopedia of Genes and Genomes (KEGG) 149 metabolic pathway annotations using custom Perl/R scripts. Differential expression analysis. Differential expression analysis was performed at the "gene" (component/unigene) level by pairwise comparisons between male and female for brain and gonad samples. The de novo transcriptome assembly of each species served as a reference for read mapping. For the quantification, the cleaned reads of gonads and brains (not larvae) of each sample of a species were aligned to the species' transcriptome assembly with Bowtie2 51 and abundance was estimated with RSEM, as implemented in the trinity script align_and_estimate_abundance.pl. The estimated expected counts for each sample, when the sum of counts in each row >15, were extracted 150 . Loci with less than fifteen reads mapped on them, were excluded prior to differential expression analysis to improve the statistical power. Count data were imported in R (version 3.2.0) and the analysis of differential expression conducted in DESeq. 2 (version: 1.8.1) 151 , an R bioconductor package 152 . Samples of both species were grouped according to sex and expression was compared separately for brains and gonads, following the developers' manual (false discovery rate, FDR, threshold of 0.05). Principal component analysis (PCA) 153 and the heatmap.2 function in the gplots package 154 were used to visualize global similarities and differences among either the brain or the gonadal samples. The global expression analysis conducted with sex and tissue being the (design/variable/intgroup) factors influencing the counts in a multifactor analysis. Each species global (brains and gonads) PCA plot was produced including all the genes in the filtered (sum of counts in each row >15) count matrix. Genes with an adjusted p value less than 0.05 were considered as significantly differentially expressed.
Functional GO and pathway enrichment analysis. Functional-enrichment analysis was performed to identify GO terms and metabolic pathways significantly enriched in differentially expressed genes in brains and gonads, in the two sparids. In each species, male and female up-regulated gonad and brain gene sets were tested for enriched GO terms, in a comparison of female versus male (female/test-set vs male/reference-set). Particularly for gonads, we restricted the gene-set to only those with |log 2 FC| >2. We also conducted enrichment analysis for common (between species) male-biased and female-biased gene set using both 'red porgy' and 'common pandora' total datasets as reference set of the annotated transcripts. All enrichment analyses were implemented in BLAST2GO with default settings enabling for a two-tailed test and reducing the output table to the most specific terms with FDR = 0.05.
Targeted search for sex-related candidate genes. To identify transcripts related with the GO terms "reproduction" (GO:0000003), "reproductive process" (GO:0022414), "sex determination" (GO:0007530) and "sex differentiation" (GO:0007548), both annotated transcriptomes were searched for the presence of these GO terms and the retrieved genes were checked for whether they exhibit a male or female-biased expression in brains and gonads. Genes known to be involved in sex determination/differentiation in vertebrates and other fish taxa were downloaded for tilapia, medaka and spotted gar Lepisosteus oculatus from Ensembl (release 84) 155 , resulting in a set of 114 candidate genes. To find out if orthologs of these genes were present in our species transcriptomes, we used a reciprocal BLASTn best-hit approach 156 . The potentially orthologous genes were then tested based on the differences in their expression patterns between testes and ovaries of both species by using the output files of DeSeq. 2. Those that were differentially expressed in both species or were species-specific were identified.