Gene set of chemosensory receptors in the polyembryonic endoparasitoid Macrocentrus cingulum

Insects are extremely successful animals with highly developed odor perception owing to their sophisticated olfactory system. The antennae, the main chemosensory organs, play a critical role in detecting odors in the ambient environment before appropriate behavioral responses are initiated. The antennal chemosensory receptor gene families have been suggested to be involved in the olfactory signal transduction pathway underlying sensory neuron responses. Macrocentrus cingulum is deployed successfully as a biological control agent against corn pest insects of the lepidopteran genus Ostrinia. In this research, we assembled antennal transcriptomes of M. cingulum using next-generation sequencing to identify the major chemosensory receptor gene families. In total, 112 olfactory receptor candidates (79 odorant receptors, 20 gustatory receptors, and 13 ionotropic receptors) were identified from the male and female antennal transcriptomes. The sequences of all of these transcripts were confirmed by RT-PCR and direct DNA sequencing. Expression profiles of gustatory receptors in olfactory and non-olfactory tissues were measured by RT-qPCR. The sex-specific and sex-biased expression patterns of the chemoreceptors suggest that they may have important functions in detecting behaviorally relevant odor molecules. These results provide a comprehensive resource for studying the molecular basis of semiochemical-driven behaviors in a polyembryonic endoparasitoid.

Bombyx mori, and OR7 in mosquitoes, but these nonconventional ORs have been universally named olfactory receptor co-receptor (Orco) 29 . The ORs detect a variety of odor compounds 30,31 , including pheromones 32 and microbe-derived or plant volatile compounds 33 . Some ORs are characterized by their response specificity 33 , whereas others appear more broadly tuned at high stimulus concentrations 30 .
GRs are mostly expressed in gustatory receptor neurons in taste organs involved in contact chemoreception 34 . Insect GRs and ORs are distantly related members of the same superfamily 9 . The GRs are more conserved in sequence and structure than ORs 35,36 , probably because the search space of associated cues is comparatively smaller. GRs typically detect sugars, bitter compounds, and contact pheromones 37 .
IRs were discovered in D. melanogaster by bioinformatic analyses as another class of receptors involved in chemoreception 5 . IRs are related to ionotropic glutamate receptors (iGluRs), which are involved in synaptic signal transduction in both vertebrates and invertebrates 20,34 . Unlike ORs, IRs have been identified throughout Protostomia (including arthropods, mollusks, annelids and nematodes) and thus constitute a far more ancient group of receptors than the ORs 19 . Because IRs have atypical binding domains that are more conserved than those of ORs, it is possible to identify several paralogous lineages among insects 20 . IR-induced responses appear to be conferred by assemblies of variable subunits in a heteromeric receptor, as up to five different IRs can be co-expressed in a single OSN 5 . A functional complex is formed by two or more IR subunits, including odor-specific receptors and one or two broadly expressed receptors (in D. melanogaster, IR25a and IR8a) that function as co-receptors 38 . Insect IRs are divided into two major types: the "antennal IRs", which are conserved across insect orders and have a chemosensory function, and the "divergent IRs", which are species-specific and have been assigned a tentative role in taste 19 .
Identification of receptor gene families has largely been possible only in insects for which genomic data are available, owing to the abundance and sequence diversity of these genes 17,47 . Recent advances in RNA-Seq and computational technologies have made such identifications widely feasible in non-model organisms. Using these technologies, a wide range of olfactory genes have been identified and reported in insects for which no sequenced genome is available 20,34,47-51 . In Macrocentrus cingulum Brischke (Hymenoptera: Braconidae), only a very limited number of olfactory genes, including co-receptors 52 , have been identified. Moreover, only one candidate gene from the OBP family (McinOBP1) has been identified and functionally studied 1 , while other candidate genes remain to be identified.
The identification of olfactory receptor gene families (ORs, GRs and IRs) in polyembryonic endoparasitoid wasps will provide information on their chemical communication, which is crucial for genetic manipulation of their sensitivity to the chemical cues used in biocontrol systems. This research investigated the antennal chemosensory gene families of M. cingulum using antennal transcriptomes generated by next-generation sequencing. M. cingulum is a polyembryonic endoparasitoid of the Asian corn borer, Ostrinia furnacalis (Guenée) (Lepidoptera:   (Fig. 1A). From the Nr annotation, 39.8% of unigenes showed 60-80% and 37.8% of unigenes showed 80-95% similarity with known proteins (Fig. 1B). Nr database queries revealed that 65.5% of sequences closely matched hymenopteran sequences (Microplitis demolitor 51.7%, Nasonia vitripennis 4.0%, Apis dorsata 2.8%, Cerapachys biroi 2.7% and Megachile rotundata 4.3%) (Fig. 1C). In the Gene Ontology (GO) annotation, M. cingulum antennal transcriptome unigenes (10,781 of 41,254 unigenes) were associated with GO terms covering three domains: biological process, cellular component and molecular function (Fig. 2 and supplementary Fig. S1). Within biological process, cellular, metabolic and single-organism processes represented most of the genes. Within molecular function, most unigenes were associated with binding activity (e.g. nucleotide binding, odorant binding, ion binding) and catalytic activity (e.g. hydrolase and oxidoreductase activity). Within cellular component, cell and cell part were the most abundant terms (Fig. 2).
In total, 5,994 unigene sequences were annotated to 256 pathways using the Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway database to identify the metabolic pathways populated by these unigenes. The "metabolic pathway" category was populated with the highest number of unigenes (1,578, 26.33%), followed by "Cellular Processes" pathways (1,422, 23.72%) and "Organismal Systems" pathways (1,400, 23.35%) (Fig. 3). This annotation information will support further research on the metabolic functions, pathways, and biological behaviors associated with M. cingulum genes.

Identification of olfactory receptor gene families.
Odorant receptors. Bioinformatic analysis of the M. cingulum antennal transcriptomes identified 109 sequences that encode candidate OR genes, including the previously described McinOR1-9 and McinOrco 52 . The transcript name, length, best BLASTx hit, identity, and male or female specificity are summarized in Table 4, while 20 other amino acid sequences with a length of ≤100 are provided as supplementary material in Table S2. A full-length McinOrco gene encoding 479 amino acids was easily identified because it contained an intact open reading frame and seven transmembrane domains, which are typical characteristics of insect ORs. The majority of partial-length transcripts possess overlapping regions with low amino acid sequence identity, indicating that they represent separate individual proteins. However, the possibility that the remaining non-overlapping transcripts represent fragments of individual proteins cannot be excluded; therefore, the total number of McinORs reported could be reduced by 20, based on sequence alignments and subsequent fragment location (i.e. C-terminus, internal, or N-terminus). We eventually included 89 OR sequences (including our previously identified ORs/Orco) in our phylogenetic analysis. With the exception of Orco, the predicted ORs shared quite low identity, probably owing to the high variance within the OR gene family. Only three of 79 ORs (McinOR12, McinOR15, and McinOR42) showed more than 50% identity with known ORs in the NCBI database (Table 4). The phylogenetic analysis also showed that ORs were extremely divergent between species but formed monophyletic groups within the same species (Fig. 4). However, the highly conserved Orco shared 95-99% amino acid sequence identity and clustered in the same branch, with an orthologous relationship among the three species (Fig. 4).
Gustatory receptors. We identified 20 transcripts encoding candidate GRs in the M. cingulum antennal transcriptome (Table 5). Most of the candidate McinGRs were partial transcripts (only six represent full-length proteins) encoding overlapping but distinct sequences, which indicates that they are individual genes even though they are fragments of protein sequences. A phylogeny was built with these 20 McinGRs and GRs from N. vitripennis, A. mellifera and D. melanogaster (Fig. 5). Based on the phylogenetic analysis, McinGRs were also observed to group with their presumed Drosophila orthologues, which have been shown to have roles in carbon dioxide detection (GR21a and GR63a) 57,58 , are members of the candidate sugar receptor subfamily GR64 (GR64e) 59 , or are bitter receptors (DmelGR93a) 60 (Fig. 5).
Ionotropic receptors. We identified 13 transcripts for putative ionotropic receptors in the M. cingulum antennal transcriptome according to their similarity to IR sequences of other insects. Comparative analysis revealed that one candidate IR was deemed an IR8a homolog owing to its high identity (71%) with MmedIR8a. IR25a and IR76b shared 57% and 46% identity with MmedIR25a.1 and MmedIR76b, respectively. These three genes (IR8a, IR25a and IR76b) are thought to function as IR co-receptors 5,38 . In the phylogenetic tree of IRs, all McinIR candidates clustered with their ionotropic receptor orthologs into separate sub-clades (Fig. 6). Because of the relatively high conservation of IRs, all the splits of McinIRs were strongly supported by high support values. The candidate IR unigenes were named according to their similarity to known IRs. Information on all IRs, including unigene reference, length, and best BLASTx hit, is listed in Table 6.

Tissue and sex specific expression profile of candidate M. cingulum chemosensory receptors.
We performed reverse transcription PCR (RT-PCR) analyses in different tissues of adult males and females to explore the expression patterns of M. cingulum OR, GR and IR genes. Most OR genes were expressed in male and female antennae, the crucial chemosensory organs, suggesting a functional role of these genes in olfaction (Fig. 7). The candidates OR31, 59, 62, 64, and 65 showed male antenna-specific expression, while only OR25 was expressed exclusively in female antennae. The remaining ORs were expressed in both sexes, with differential expression between males and females among tissues. Five of them, OR11, 14, 54, 55 and 81, were most highly expressed (>200 M. pixel) in both male and female antennae (Fig. 7).  (Fig. 8).
Quantitative real-time PCR (qPCR) was used to investigate gustatory receptor transcript abundances in the antennae, head with mouthparts, legs, and body of both males and females. Comparison of expression levels showed that McinGR10, 13, 14, 17, 18 and 19 were expressed at similar levels in all tested tissues, except for McinGR14, 18, and 19 in the body of both sexes (Fig. 9).

Discussion
This is the first comprehensive analysis of a polyembryonic endoparasitoid antennal transcriptome aimed at identifying the major chemosensory receptor gene families (ORs, GRs and IRs) involved in olfaction. The identified gene families represent a valuable genomic resource for M. cingulum, as they include potential target genes for manipulating the behavior of the parasitoid wasp and improving biocontrol techniques.
The GO annotation showed that the distribution of M. cingulum transcripts across the three predicted functional categories was overall similar to that obtained in previous reports 34,51,61,62 . The numbers of identified transcripts in the olfactory gene families were also comparable with those of other dipteran, coleopteran, hymenopteran and lepidopteran species for which antennal transcriptomes have been reported 20,34,61,63-66 . Comparison of these published data sets suggests a certain level of conservation in gene expression patterns in antennae.
From the M. cingulum antennal transcriptome, a total of 109 OR genes were identified, including those from our previous studies. The total number of ORs identified in M. cingulum is greater than that in M. mediator (68 ORs) 51 but lower than that in A. mellifera (170 ORs) or N. vitripennis (301 ORs) 67,68 . Because OR genes were identified only from the antennal transcriptome, ORs expressed in other tissues may have been missed in our study. The differences in the number of identified OR genes may also result from sequencing methods and depth and/or sample preparation. The large number of ORs identified in M. cingulum could also reflect species differences, so further research is required for confirmation.
Insect ORs are mainly expressed in the antennae 69 . The present research revealed that several ORs have sex-specific expression, while others showed a ubiquitous expression pattern but with higher expression levels in antennae. The differential expression patterns of McinORs are consistent with previous studies 51,70-72 . The male antenna-specific ORs (31, 59, 62, 64, 65, and 66) and male-biased ORs (18, 19, 22, 45, 38, 41, 43, 46, 48, 49, 50, 52, 53, 57, 60, 63, 67, 69, 70, 79, and 82) may play crucial roles in the detection of sex pheromones or in male-specific behaviors. The female-specific OR25 and the female-biased OR27, 40, and 58 may be involved in female-specific behaviors, e.g. finding hosts for oviposition. Additionally, ORs expressed in other organs of the wasp may have physiological functions; for example, an Orco expressed in the testes of A. gambiae was considered to be involved in spermatozoa activation 71 .
The GR family of insect chemoreceptors includes receptors for sugars and bitter compounds, as well as for cuticular hydrocarbons and odorants such as CO2 . Antennal GRs of insects are used for taste as well as for olfactory detection 73 . However, there are no reports of GRs in the antennae of polyembryonic endoparasitoid wasps. To our knowledge, this is the first report of the GR family among the chemoreceptors of this wasp, although some studies have described the distribution of gustatory sensilla in wasp antennae 51,74 . We identified 20 putative GR-encoding transcripts from the M. cingulum antennal transcriptomes. The identified GRs also included potential carbon dioxide receptors, which suggests that M. cingulum might use CO2 detection as a cue for host selection, as in B. mori, T. castaneum and mosquitoes 21,75 . The GR64f orthologue clustered with sugar receptors of the GR1 subfamily in Hymenoptera, suggesting a function in sugar detection. In addition, the identified McinGRs were differentially expressed in sensory and non-sensory organs. Recent studies have shown a wide range of non-gustatory sensory functions of insect GRs 76 , indicating that GRs probably have far more divergent functions in antennae.
In insects, IRs are generally more conserved than ORs and GRs 19 and can be categorized as divergent species-specific IRs and conserved antennal IRs 5 . The antennal IR subfamily constitutes only a portion of the IRs, while the others belong to the divergent IR subfamily, which shows species-specific expansions that are particularly large in Diptera 19 . In D. melanogaster, there are 66 IRs, of which 15 are antennal IRs 77 . In this study, 13 IR candidates, including two co-receptors (IR8a and IR25a), were found in the antennal transcriptome. A limited number of IR genes have been identified in Hymenoptera (only 6 in M. mediator, 10 in A. mellifera and N. vitripennis) compared with Diptera (66 in D. melanogaster) 19 . The number of IR genes we identified is close to that of A. mellifera and N. vitripennis but higher than that of M. mediator. Sequence alignments showed that the putative McinIR8a and McinIR25a had high similarity with MmedIR8a and MmedIR25a, respectively. These two genes were expressed more highly in female than in male antennae and probably play a significant role in host recognition for oviposition 51 . However, the ubiquitous expression of McinIRs suggests that these genes may have other physiological functions in non-olfactory organs.
The M. cingulum transcriptome data indicated that the chemosensory gene repertoire was largely similar between males and females, with differences only in the relative expression levels of individual ORs (Fig. 7). Therefore, while male and female antennae likely perceive similar odor stimuli, their sensitivities, and hence the significance of odors to males and females, may differ. Female M. cingulum use host larval frass in combination with different volatile cues for host searching. Herbivore-induced plant volatiles (HIPVs) and green leaf volatiles (GLVs) emitted by plants infested by host insects are considered to be used by female wasps as chemical cues 78,79 . The female-specific or female-biased ORs in antennae may play an important role in the recognition of host volatiles, which provides a starting point for manipulating wasp ORs involved in host finding and for developing them as biological tools for pest control.

Conclusion
This study reports the first antennal transcriptome analysis in the polyembryonic endoparasitoid M. cingulum. The genes reported here provide valuable insight into the molecular mechanisms of olfaction in this wasp. A large number of ORs, GRs and IRs were identified in M. cingulum; however, additional molecular and functional experiments are required to confirm the expression and roles of these genes. Our results provide foundational knowledge for exploring and understanding the chemosensory receptor gene families of this wasp. Transcriptomic analysis via next-generation RNA-Seq is a promising approach for non-model organisms, especially polyembryonic parasitoids.

Total RNA extraction. Antennae were cut from 1-2-day-old male and female wasps after snap freezing in liquid nitrogen. Heads with mouthparts, legs, thoraxes, and abdomens (wingless) collected from wasps of the same age were used for the RT-PCR validation of gene sequences. All tissues were immediately stored at −80 °C until further processing. Total RNA was extracted from the antennae or other tissues using TRIzol reagent (Invitrogen, Carlsbad, CA, USA) as per the manufacturer's instructions. RNA integrity was verified by 1% agarose gel electrophoresis, and quantity was assessed with a Nanodrop ND-2000 spectrophotometer.
Antennal transcriptome generation. Synthesis of cDNA and Illumina library generation were completed at Novogene Co., Ltd., Beijing, China, using Illumina HiSeq2500 sequencing. The FastQC tool was used to obtain read statistics, assess read quality, and remove low-quality data. High-quality reads were obtained from the raw reads by removing adaptor sequences, empty reads, low-quality sequences (reads in which unknown bases "N" made up >10% of the sequence), and reads in which more than 50% of bases had Q ≤ 20. All analyses were based on the clean reads. The transcriptome data were combined and de novo assembled using Trinity 81,82 .
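The read-cleaning criteria described above (discard empty reads, reads with more than 10% unknown "N" bases, and reads in which more than 50% of bases have Q ≤ 20) can be illustrated with a minimal Python sketch. This is not the pipeline used in the study: the function names and the Phred+33 quality encoding are assumptions made for illustration, and adaptor trimming is not shown.

```python
def phred33_to_scores(quality_string):
    """Decode a FASTQ quality string (assumed Phred+33 encoding) to integer scores."""
    return [ord(ch) - 33 for ch in quality_string]


def passes_quality_filter(seq, quals, max_n_frac=0.10, low_q=20, max_low_q_frac=0.50):
    """Return True if a read survives the filters described in the text.

    seq   -- nucleotide sequence of the read
    quals -- per-base Phred quality scores, same length as seq
    """
    if not seq:
        # Empty reads are discarded.
        return False
    # Fraction of unknown bases; reads with more than 10% N are discarded.
    n_frac = seq.upper().count("N") / len(seq)
    # Fraction of bases with Q <= 20; reads with more than 50% such bases are discarded.
    low_q_frac = sum(q <= low_q for q in quals) / len(quals)
    return n_frac <= max_n_frac and low_q_frac <= max_low_q_frac
```

For example, a 10-base read containing two N bases (20%) would be rejected, whereas a read with exactly one N (10%) and all qualities above 20 would be kept, because the thresholds are strict "more than" limits.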
Trinity is well suited to handling sequence quality and polymorphism issues owing to the bubble-popping algorithms in each of its three modules, Inchworm, Chrysalis and Butterfly. To obtain comprehensive information on the genes, we annotated them against the Nr, Nt, Pfam, KOG/COG, Swiss-Prot, KEGG, and GO databases. Open reading frames were predicted using ESTScan 3.0. Gene expression levels were estimated with RSEM 83 , and differential expression analysis between the two groups was performed using the DESeq R package (1.10.1) 84 . P-values were adjusted to q-values 85 , and q-value < 0.005 and log2(fold_change) > 1 were set as the thresholds for significantly differential expression.
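The significance thresholds stated above can be applied to a table of per-gene results as in the following Python sketch. It is an illustration only, not the DESeq workflow itself. The input tuples and gene names are hypothetical, and the use of the absolute value of log2(fold_change), so that both up- and down-regulated genes are retained, is an assumption; the text states only "log2(fold_change) > 1".

```python
def significant_genes(results, q_max=0.005, min_abs_log2fc=1.0):
    """Select genes that meet the q-value and fold-change thresholds.

    results -- iterable of (gene_id, q_value, log2_fold_change) tuples
    Returns a list of (gene_id, direction) pairs, where direction is
    "up" or "down" relative to the comparison group (assumed convention).
    """
    hits = []
    for gene_id, q_value, log2fc in results:
        # Assumption: the threshold applies to |log2(fold_change)|.
        if q_value < q_max and abs(log2fc) > min_abs_log2fc:
            hits.append((gene_id, "up" if log2fc > 0 else "down"))
    return hits


# Hypothetical example values, not data from the study.
example = [
    ("geneA", 0.001, 2.3),    # passes both thresholds
    ("geneB", 0.001, 0.5),    # fold change too small
    ("geneC", 0.02, 3.0),     # q-value too large
    ("geneD", 0.0001, -1.8),  # passes, lower in the comparison group
]
print(significant_genes(example))
```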
Gene identification and annotation. For sequence homology assessment of both male and female M. cingulum antennal transcriptomes, gene ontology (GO) annotation was performed using Blast2GO via searches against the NCBI non-redundant protein database (using BLASTp with a 1e-10 threshold) 86,87 . GO-annotated genes or transcripts were assigned to three domains: molecular function, biological process, and cellular component.

RT-PCR analysis.
To explore the expression of the ORs identified from the antennal transcriptome and compare the differential expression patterns between the sexes, RT-PCR was conducted with cDNAs prepared from male antennae, female antennae, and male and female bodies (including head, legs, thorax, and abdomen) for OR genes, and from male and female antennae, head with maxillary palps, legs and body for GR and IR genes. Three independent samples of total RNA were isolated from the above-mentioned tissues, and the corresponding cDNAs were synthesized using the TranScript® one-step gDNA removal and cDNA synthesis supermix (TRANSGENE Biotech, Beijing, China) following the kit manual. β-actin (accession number EU585777.1) was used as the reference gene and to select the cDNA templates on the PCR equipment. Primers, listed in the supplementary material (Table S1), were designed manually or with Primer 5 (http://frodo.wi.mit.edu/primer5/). Individual PCR reactions were repeated three times; controls consisted of no-template PCRs. The PCR conditions consisted of an initial 3-min step at 94 °C; 30 cycles of 94 °C for 30 s, 56, 57 or 59 °C (depending on primers) for 30 s, and 72 °C for 3 min; and a final 10-min step at 72 °C. Products were analyzed on a 1% agarose gel and visualized after staining with ethidium bromide. The images were recorded digitally with a Dolphin-DOC (Wealtec Corp.) using a 1141101 CCD camera (12 V ac/dc) and stored on a computer. The brightness of each band was measured from the digital images using Adobe Photoshop version CS3.  96 . All data were normalized to endogenous β-actin levels from the same tissue samples, and the relative fold change in different tissues was calculated with the transcript level in the abdomen as the calibrator. Thus, the relative fold change in different tissues was assessed by comparing the expression level of each GR in other tissues with that in the abdomen.
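The normalization described above (β-actin as the reference gene, abdomen as the calibrator tissue) can be sketched in Python as follows. The sketch assumes the widely used 2^(-ΔΔCt) calculation; the surviving text does not name the exact formula, so this choice, the function name, and the Ct values in the example are illustrative assumptions rather than the study's own procedure or data.

```python
def relative_fold_change(ct_target, ct_reference, ct_target_cal, ct_reference_cal):
    """Relative expression of a target gene by the 2^(-ddCt) method (assumed here).

    ct_target        -- Ct of the target gene (e.g. a GR) in the tissue of interest
    ct_reference     -- Ct of the reference gene (beta-actin) in the same tissue
    ct_target_cal    -- Ct of the target gene in the calibrator tissue (abdomen)
    ct_reference_cal -- Ct of the reference gene in the calibrator tissue
    """
    # Normalize the target to the reference gene within each tissue.
    delta_ct_sample = ct_target - ct_reference
    delta_ct_calibrator = ct_target_cal - ct_reference_cal
    # Compare the tissue of interest with the calibrator tissue.
    delta_delta_ct = delta_ct_sample - delta_ct_calibrator
    return 2.0 ** (-delta_delta_ct)


# Hypothetical Ct values: target Ct 22 vs 25, with beta-actin Ct 18 in both tissues.
print(relative_fold_change(22.0, 18.0, 25.0, 18.0))  # 8.0
```

By construction, the calibrator tissue itself always has a relative fold change of 1, so values above 1 indicate higher normalized expression than in the abdomen and values below 1 indicate lower expression.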