Regulation of transcription factors on sexual dimorphism of fig wasps

Fig wasps exhibit extreme intraspecific morphological divergence in the wings, compound eyes, antennae, body color, and size. Corresponding to this, behaviors and lifestyles between two sexes are also different: females can emerge from fig and fly to other fig tree to oviposit and pollinate, while males live inside fig for all their lifetime. Genetic regulation may drive these extreme intraspecific morphological and behavioral divergence. Transcription factors (TFs) involved in morphological development and physiological activity may exhibit sex-specific expressions. Herein, we detect 865 TFs by using genomic and transcriptomic data of the fig wasp Ceratosolen solmsi. Analyses of transcriptomic data indicated that up-regulated TFs in females show significant enrichment in development of the wing, eye and antenna in all stages, from larva to adult. Meanwhile, TFs related to the development of a variety of organs display sex-specific patterns of expression in the adults and these may contribute significantly to their sexual dimorphism. In addition, up-regulated TFs in adult males exhibit enrichment in genitalia development and circadian rhythm, which correspond with mating and protandry. This finding is consistent with their sex-specific behaviors. In conclusion, our results strongly indicate that TFs play important roles in the sexual dimorphism of fig wasps.

. The genetic mechanisms driving these sex-specific differences have remained largely unknown. Considering the differences between male and female fig wasps 7 , their organ development, energy metabolism and physiological demands may also differ. The roles played by transcription factors (TFs) in sexual dimorphism are critical for understanding the system 8 . For one TF may include several domains with different functions in an organism 9 , it is also essential to consider TFs in the domain level.
High-throughput sequencing is an important tool for gene expression study. Herein, we apply RNA-Seq to four different developmental stages of the fig wasp Ceratosolen solmsi to explore the relationship between TF-expression and sex-specific traits. The genome of this fig wasp, which pollinates Ficus hispida, has been sequenced recently 10 . Herein, we mainly focus on transcriptomes to investigate the genetic drivers responsible for phenotypic and behavioral differences. Our results imply that sexual dimorphism closely associates with sex-specific expressions of TFs. These analyses provide insights into the morphological divergence between male and female fig wasps.

Results
TF distributions in three species of Hymenoptera. We employed the total functional genes of C. solmsi, Nansonia vitripennis and Apis mellifera to identify their TFs. Herein, we both employ homologous sequence alignment with known TFs of Drosophila melanogaster 11,12 and predict them from the DNA-binding domain (DBD) database ( Figure S1). In total, predictions indicate that C. solmsi has 865 TFs, N. vitripennis 1189 and A. mellifera 880. Overall, TFs in C. solmsi and A. mellifera have similar numbers of TFs. The TFs domains information of C. solmsi, N. vitripennis, A. mellifera and D. melanogaster were list in Table 1. Among these TFs, zf-C2H2 exhibits the greatest number of domains among the three hymenopterans. Homeobox domains constitute the second most common class of domain in the hymenopterans, as it does in D. melanogaster 13 . In contrast, TFs with rve and RVT domains occur only in N. vitripennis. These domains were shown to be necessary for most retrotransposons and retroviruses 14,15 . The possible degradation of retrotransposons in genomes of C. solmsi and A. mellifera may owe to the loss of their rve and RVT domains 10,16 .  (Fig. 1A). Notwithstanding, C. solmsi appears to have lost a few TFs with these domains. The number of DBDs per TF (Fig. 1B) is coincident in C. solmsi and N. vitripennis. In both species, most TFs contain only one DBD. Further, the number of DBDs per TF relates inversely to the proportion of that class of TF. In contrast, the largest proportion of TFs in A. mellifera has 2 DBDs and this accounts for the relatively rich TF-function observed in A. mellifera. These findings may reflect the complicated social groups of A. mellifera. Moreover, TFs with 5 or more DBDs occur in N. vitripennis and A. mellifera, but not in C. solmsi. Thus, C. solmsi possesses a relatively simple composition of TFs in comparison with N. vitripennis and A. mellifera; this also corresponds with morphological reductions.
TFs functions related to morphological divergence. To classify TFs functions in the fig wasp, we evaluate the Gene Ontology (GO) term for each TF and look for their functional enrichment. In addition to fundamental biological functions, such as regulation of transcription, gene expression and RNA metabolic process, many TFs of fig wasps associate closely with wing, eye, antenna, limb, appendage and photoreceptor cell development, and other aspects of sexual dimorphism ( Fig. 2A, Table S1). Among them, some TFs participate in multiple functions. For example, a few TFs participating in eye development also associate with photoreceptor cell development ( Fig. 2A). The 53 TFs associated with wing development assign to 7 related GO terms: wing disc morphogenesis, wing disc pattern formation, wing disc dorsal/ ventral pattern formation and so on (Fig. 2B). Similar to wing, a total of 60 TFs participate in eye development and these belong to 14 associated GO terms, such as eye development, compound eye development, eye morphogenesis and compound eye morphogenesis (Fig. 2C). Furthermore, 25 TFs relate to photoreceptor cell development, 19 TFs participate in antenna development and 79 TFs closely associate with sex, such as sexual reproduction, sex differentiation, sexual characteristics and sexually dimorphic development (Fig. 2D). We further examine whether their potential targets connect with respective GO terms, and the results well demonstrate that dimorphic sexually related functions occur in the potential targets of TFs. Similar to TFs, among their potential targets, genes with functions related to wing, eye and other organ development also show enrichment ( Figure S2).
Genes and TFs expression in two sexes across four developmental stages. RNA-Seq was used to investigate the expression of genes and TFs of the sexes in four developmental stages: larva, early pupae (21 st day), late pupae (25 th day) and adult. Approximately 20 M reads of each sample were mapped to the genome with high quality (Q > 20). Reads were assigned to transcripts based on their overlap with 11412 previously defined reference gene models 10 . Similar to our previous result 10 , most genes were highly expressed (with reads per kb per million mapped reads (RPKM) > 10) in the early pupa stage ( Figure S3A). Following the development of fig wasps, the number of highly expressed genes reduced in the late pupa and adult stages. Excluding early pupae, only about half of all genes were highly expressed and adult males exhibited the lowest number of expressed genes.
TFs cluster into basal and sequence-specific TFs 17 . Basal TFs play a supplementary role in regulating gene expression 18 , while sequence-specific TFs are involved in the regulation of specific target genes 19 . In fig wasps, sequence-specific TFs exhibit similar expression profiles to whole genes (Fig. 3A). The expression profiles of basal TFs consistently show high levels of expression in females at all the four stages, but exhibit low expression (RPKM < 5) in males in the late pupae and adult stages compared to females (p < 0.001) (Fig. 3B).

Figure 2. Function of transcription factors (TFs) related to wing, eye development and sexual differentiation of C. solmsi.
A. Enrichment analysis of TFs function. Total TFs were subjected to Gene Ontology (GO) analysis. An enrichment map was constructed by using Cytoscape installed with the Enrichment Map plugin. Red node represents each enriched GO pathway (p < 0.01, FDR q < 0.05, overlap cutoff > 0.5). Node size proportional to the total number of genes in each pathway. Edge thickness represents the number of overlapping genes between nodes. GO pathways of similar functions were sorted into one cluster, marked with circles and labels. Gene numbers in each cluster are labeled. B C and D indicated the number of GO terms classes related to wing, eye development and sexual differentiation, respectively.
Scientific RepoRts | 5:10696 | DOi: 10.1038/srep10696 The number of expressed genes differs substantially among the developmental stages based on our conserved criterion that a false discovery rate (FDR)-corrected p-value must be less than 0.001 and the fold change greater than 2 ( Figure S3B). Larval females have only 612 and 671 genes significantly up-or down-regulated compared to males, respectively. Early pupae females have 1126 and 482 significantly up-or down-regulated genes, respectively. The late pupae and adult stages differentially express a large proportion of genes. In late pupae, 4245 and 465 genes are up-and down-regulated in females, respectively; up-regulated genes are about 10 fold greater than down-regulated ones. This may relate to female-specific pupae development. In adult females, 4044 genes are up-regulated and 1813 are down-regulated significantly. Overall, late pupa and adult females abundantly up-regulate their genes. Intersex patterns of TF expression in the four stages are similar to those of whole genes (Fig. 3C). In the late pupae and adult stages, more than half of the total TFs are differently expressed. Females have more up-regulated TFs than down-regulated ones across all four development stages, especially in the late pupae and adult stages. Adults have the most up-regulated TFs among male developmental stages (Fig. 3C).
Up-regulated TFs related to female wing, eye and antenna development. GO terms serve to identify the functions of enriched TFs in female developmental stages. Further, we explore function enrichment within up-and down-regulated TFs during four development stages in females. In all stages, wing and eye development-related TFs show significant enrichment in up-regulated TFs in females. Thus, these organs may develop from the larva stage, and that development occurs until the adult stage (Fig. 4A). Most of these TFs have the highest levels of expression in early pupae (Fig. 4B). In both sexes, levels of TFs expression decreased in adults compared to other three stages and only a few TFs displayed sex-specific expression (Fig. 4B). The expression patterns of TFs that participate in limb and leg development in females are similar to those of wing and eye ( Figure S4).
In contrast, the TFs associated with antenna development differ from those of wing and eye; few TFs have high expression levels in adult females, indicating that antenna development may be completed in the late pupae stage (Fig. 4A). In contrast to females, males do not exhibit functional enrichments of TFs

Sex-differentially expressed TFs in adults.
The functional enrichment of up-regulated TFs in male adults indicates that metamorphosis, sensory organ development, imaginal disc development and appendage development have significant enrichments. However, up-regulated TFs in females also have these functional enrichments, but females have fewer than males (Fig. 5A). The enrichment of these functions is significantly higher in males than females yet females have larger total number of up-regulated TFs (Fig. 5B). In either sex, most of these up-regulated TFs mainly involve sex-specific expressions (Fig. 5C). TFs associated with wing, eye, antenna, limb, leg and the development of some other organs display sex-specific high levels of expression in adults and this may contribute significantly to adult sexual dimorphism. Up-regulated TFs in adult males involve enrichment in pigmentation, gland morphogenesis, segment specification, genitalia development, sexual characteristics, gonad development, germ cell migration, response to hormone stimulus, circadian rhythm and other functions. These GO terms do not show significant enrichment in up-regulated TFs of adult females (Fig. 5D). Most TF expression profiles for these functions indicate low levels of expression in females. This corresponds with being male; up-regulated TFs participating in circadian rhythm may relate to protandry. In contrast, female up-regulated TFs associate with eggshell chorion assembly, ovarian follicle cell development, eggshell formation, oogenesis, female gamete generation, reproductive developmental process, and sexual reproduction; these involve female reproduction. Up-regulated TFs in females also have significantly enriched TFs in a variety of biosynthetic processes such as metabolic process and neuron differentiation function, among others. This closely associates with their greater energy requirements and physiological needs.

Discussion
Accurate and comprehensive predicting of TFs may be critical for understanding sex-specific differentiation. Identified TFs of D. melanogaster facilitate the predicting of TFs in C. solmsi, N. vitripennis and A. mellifera. These predictions allow explorations into the evolution of TFs in Hymenoptera, and in particular those of C. solmsi. Some TF domain-numbers in D. melanogaster differ from those of the hymenopterans. For example, TFs with BESS domains are less prevalent in the Hymenoptera than in D. melanogaster. BESS domains in TFs tend to interact with MADF domains. Typical, this interaction occurrs in TF Dip3, which possesses an N-terminal MADF domain and a C-terminal BESS domain. This structure was reported to be conserved in at least 14 proteins of Drosophila 20 . Hymenoptera also possess this combination of domains, but the BESS domains with some degradation when compared to D. melanogaster. In contrast, TFs with DnaJ and Ank domains are less common in D. melanogaster than in the three hymenopterans. This indicates the expansion of TFs with DnaJ and Ank domains. In addition, TFs with PCI domains occur only in Hymenoptera, indicating that they may constitute important adaptations to hymenopteran ecology. For three hymenopteran species, most TFs in fig wasps also occur in other hymenopterans, indicating ancient origins and vital roles these TFs play. In addition, C. solmsi possesses fewer TFs and DBDs than do N. vitripennis and A. mellifera. This may be a consequence of their highly specialized ecology and strict host-specificity. In general, host-specific endosymbionts such as C. solmsi show specialized adaptations to life inside their hosts, such as a series of morphological reductions, especially in males. This may relate to reductions in some TF categories . Fig wasps have fewer TFs and greater morphological reductions compared to the other two hymenopterans.
The GO analyses indicated some TFs closely associate with wing, eye, antenna and photoreceptor cell development, sex differentiation. Effectively, all of these TFs likely affect sexual dimorphism. Regarding behavioral divergence of females and males, some TFs participate in biosynthetic process, metabolic process, sensory organ development, neuron differentiation, hormone stimulus and circadian rhythm, which contribute significantly to behavioral divergence.
In all four stages of development, wing-and eye development-related TFs show significant enrichment in up-regulated TFs of females. In addition, TFs that participate in limb, leg development, have patterns of expression that are similar to those of wing and eye. Differing from wing and eye, few TFs associated with antenna development highly expressed in adults; antenna development may reach completion in the late pupae stage. Therefore, sexual dimorphism likely closely associates with up-regulation of TFs in females. A variety of organ-development TFs display sex-specific expression patterns, which may contribute significantly to adult sexual dimorphism. High levels of expression of these TFs in females in early development stage may associate with their relatively complicated morphologies compared with males. Similar expression patterns of TFs and total genes may suggest that these TFs play a critical role in the control of transcriptional regulation. Adult males chew small holes in the walls of galls to enable their mating with females and some males chew holes in the fig wall to enable fertilized females to fly to another fig tree to oviposit and in doing so pollinate figs. Therefore, the adult lives of males and females differ substantially, as do their morphologies 21 . Adult females have a similar numbers of up-and down-regulated TFs and most of these differentially expressed TFs display sex-specific high levels of expression in two sexes, suggesting that these TFs may relate significantly to adult sexual dimorphism, such as TFs related to variety of organ-development. Meanwhile, significant up-regulated or sex-specific expression of TFs also plays an important role in the behavioral divergences of adult fig wasps. Adult males show enrichment in up-regulated TFs involved in pigmentation, gonad development, germ cell migration, response to hormone stimulus, circadian rhythm and other characteristics that associate with male-life and protandry. In contrast, female TFs associated with oogenesis and sexual reproduction are up-regulated significantly, which also relates to their reproduction. Enrichment of female up-regulated TFs occurs in a variety of biosynthetic processes including metabolic process, which closely associates with their energy requirements and physiological needs. Moreover, both females and males have a few TFs with relatively stable levels of expression across all four stages. Excluding functions related to wing and eye development, for some of them also participate in necessary biological processes, such as the regulation of transcription and RNA metabolic process, which may account for their stable expression. Fig wasps, which have very small bodies, cannot be cultivated ex situ. Their larvae and pupa live inside figs and their sex only can be determined using molecular tools. These attributes make it virtually difficultly to carry out multiple replicates to us, which constrains the strength of our conclusions. Consequently, we use DEGseq 22 , which is suitable for no replicate samples analyses to verify the reliability of the results in the absence of duplicate samples. We use the following strict standards to define differential gene expression: FDR less than 0.001 and fold change greater than 2. Most of candidate gene with differential expression have a fold-change much greater than 2. Thus, qualitative analysis of up-or down-regulation conform to our strict standards. Regardless, herein we provide a new perspective on understanding sexual dimorphism at the level of TFs.
In conclusion, this study provides new insights into the evolution of TFs in fig wasps and sexual dimorphism closely associates with TFs. Further experimentation on these TFs and their potential targets will lead to a more complete understanding of the functional roles TFs play in determining the remarkable sexual dimorphism of fig wasps.

Methods
TF prediction in C. solmsi. TFs were divided into basal TFs and sequence-specific TFs 17 . Basal TFs played a supplementary roles in regulating gene expression 18 while sequence-specific TFs were shown to be involved in the regulation of specific target genes 19 . Hence, the methods of prediction differed between the two types of TFs.
Predictions for basal TF were made based on conserved domains. First, basal TFs of D. melanogaster were used to find the basal TF domains by employing Pfam 23  The DBD database (http://www.transcriptionfactor.org) was used to predict sequence-specific TFs and the number of predicted transcription factor repertoires in the DBD database increased from 150 in the initial version to over 700 at present [24][25][26] . Because this coverage was not sufficiently comprehensive, we also carried out homologous sequence alignment against TFs of D. melanogaster. In our analysis, each protein sequence  wasp, we searched them against the NR database using BLAST to confirm their TF functions and domains. Thus, we predicted TFs by combining two methods ( Figure S1). The DBD-based method predicted a total of 573 sequences as being TFs. After manually removing 40 that were unlikely to be TFs (e.g., enzymes, transmembrane proteins) 26 , we obtained 533 TFs, including 168 new TFs. Using homologous sequence alignments with TFs of D. melanogaster 28  Evolutionary analyses. To summarize relationships of the TFs in different species, multiple alignment was performed using CLUSTALW2 29 . The raw alignment was corrected manually, and unreliably aligned regions containing gap positions were deleted using GBLOCKS 30 . Maximum likelihood trees were constructed using Phyml 31 with the best-fit evolutionary model suggested by Prottest 3.0 32 . Branch support values were gained by performing 1,000 bootstraps.
RNA extraction and sequencing. The transcriptomic data employed herein included eight samples obtained from males and females at four developmental stages: larval (16 days after oviposition); pupal 21 (21 days after oviposition, which is the first day of the pupal stage); pupal 25 (25 days after oviposition, which is the first day of the yellow pupal stage); and adult (30 days after oviposition. Most wasps exit the galls as adults, although they do not emerge from the fig) 10 . Total RNA was isolated using the RNeasy® Micro Kit (Qiagen, Shanghai, China) and treated with DNase (Qiagen, Shanghai, China). A NanoDrop ND-1000 Spectrophotometer (Nano-Drop Technologies, Wilmington, DE, USA) was used to confirm adequate RNA concentration and A260/A280 ratio. RNA was dissolved in 20 μ L RNase-free water and kept at − 80 °C. Larval females and males that had no distinct morphological divergence were discriminated by the variable splicing pattern of the sex determination gene doublesex 33  For RNA-seq, beads with oligo(dT) were used to isolate poly(A) mRNA. Fragmentation buffer was then added for cutting mRNA into short fragments, which were used as templates. Random hexamer primers were used to synthesize first-strand cDNA. Second-strand cDNA was synthesized using a mixture of buffer, dNTPs, RNase H, and DNA polymerase I. Short fragments were purified with QiaQuick PCR extraction kits and resolved with EB buffer for end repair and addition of poly(A). Next, the short fragments were connected with sequencing adaptors. For amplification with PCR, we selected suitable fragments as templates based on agarose gel electrophoresis. Finally, the libraries were sequenced using an Illumina HiSeq™ 2000.
Genome data and RNA-Seq data analyses. The genomes of N. vitripennis and A. mellifera 16,34 were available. The genome and transcriptomes of C. solmsi were recently sequenced 10 . The adaptor sequences on raw RNA-seq reads for each sample were stripped by Cutadapt (http://code.google.com/p/cutadapt/) and low quality bases were removed by Trimmomatic 35 . Reads that were less than 20 nt in length or that contained an ambiguous nucleotide were discarded. The remaining reads were aligned to the C. solmsi genome using TopHat2 while allowing for up to two mismatches 36,37 . Uniquely mapped reads with a mapping quality larger than or equal to 20 were retained for subsequent analyses. The number of reads mapped to each gene in each RNA-seq sample was counted using the HTSeq python package (http:// www-huber.embl.de/users/anders/HTSeq/doc/overview.html) with the 'union' overlap resolution mode and -stranded = no. The reads-count for each gene was input for the R-package DEGseq to calculate the significantly differentiated expressed genes (Benjamini-Hochberg false discovery rate (FDR) ≤ 0.001 and fold change > 2) with the MARS method 22 . For each sample, RPKM was computed as the number of reads which map per kilobase of exon model per million mapped reads for each gene 38 . Expression profiles were generated with Cluster 3.0 39 and visualized with Java Treeview 40 . Because global patterns of expression were published previously 10 , herein we considered the dominant roles of TFs played in regulating gene expression while focusing on TF-expression and function enrichments related to sexual dimorphism.
Gene Ontology Enrichment. The GO term for each TF was retrieved by applying blast2go 41  obtained. Then, their GO term were also retrieved by blast2go 41 .
Gene molecular functional enrichment in different gene subsets was determined using Generic Gene Ontology (GO) Term Finder 48 . GO Term Finder located significant GO terms in a list of genes and each gene product may have been represented by three independent structured and controlled vocabularies: molecular function, biological process and cellular component. GO terms with a p value of less than 0.01 were determined to be statistically significant. An enrichment map of each specific genes constructed by Cytoscape 2.8.3 installed with the Enrichment Map plugin and the parameter is as follows: p < 0.005, FDR q < 0.05, overlap cutoff > 0.5 49 .