Transcriptome comparison of the sex pheromone glands from two sibling Helicoverpa species with opposite sex pheromone components

Differences in sex pheromone component can lead to reproductive isolation. The sibling noctuid species, Helicoverpa armigera and Helicoverpa assulta, share the same two sex pheromone components, Z9-16:Ald and Z11-16:Ald, but in opposite ratios, providing an typical example of such reproductive isolation. To investigate how the ratios of the pheromone components are differently regulated in the two species, we sequenced cDNA libraries from the pheromone glands of H. armigera and H. assulta. After assembly and annotation, we identified 108 and 93 transcripts putatively involved in pheromone biosynthesis, transport, and degradation in H. armigera and H. assulta, respectively. Semi-quantitative RT-PCR, qRT-PCR, phylogenetic, and mRNA abundance analyses suggested that some of these transcripts involved in the sex pheromone biosynthesis pathways perform. Based on these results, we postulate that the regulation of desaturases, KPSE and LPAQ, might be key factor regulating the opposite component ratios in the two sibling moths. In addition, our study has yielded large-scale sequence information for further studies and can be used to identify potential targets for the bio-control of these species by disrupting their sexual communication.

regulatory mechanisms that determine these species-specific ratios 22 , but the mechanisms remains not well known especially from the molecular perspective. Therefore, we constructed and sequenced cDNA libraries from the PGs isolated from H. armigera and H. assulta to investigate the genetic factors associated with sex pheromone biosynthesis in these two species.
After analysis, we identified 108 and 93 putative pheromone biosynthesis, transport, and degradation transcripts in the PGs of H. armigera and H. assulta, respectively. Our results together with previous studies 22,24 support the conjecture that the regulation of DESs is likely to play an important role in determining the opposite sex pheromone components ratios in the two species. In addition, our results also provide large-scale sequence information for further studies and identification of potential targets to disrupt sexual communication in H. armigera and H. assulta for the control of these lepidopterans.

Results
Overview of the PG transcriptomes. PGs from H. armigera and H. assulta were collected as previously described for the Heliothis virescens PG transcriptome 25 (Fig. 1) followed by construction of the corresponding cDNA libraries. Large-scale transcripts were assembled and annotated in the PG transcriptomes from H. armigera and H. assulta (Supplementary Table S1 online).
GO annotation was used to classify the PG transcripts into functional categories. GO terms were represented in all three major GO categories: biological process, cellular component, and molecular function. The most represented sub-category in the biological process category was cellular process, in the cellular component category it was cell and cell part, and in the molecular functions category, binding and catalytic activity were the most represented (Fig. 2).
Identification of putative genes involved in pheromone biosynthesis, transport, and degradation in the two Helicoverpa species. After removal of repetitive sequences following blastX against the NCBI Nr database and alignment with ClustalX 2.0, we identified a total of 108 and 93 putative transcripts involved in the pheromone biosynthesis, transport, and degradation in H. armigera and H. assulta PGs, respectively (Tables 1 and 2). These transcripts belonged to gene families represented by multiple transcripts in these two moth species. For example, ACC had 2 members in the 2 species each, alcohol dehydrogenase (ALR) was represented by 17 and 18 sequences in H. armigera and H. assulta, DES with 7 and 8, FAS with 3 and 3, FAR with 18 and 13, CSP with 19 and 16, OBP with 26 and 23, aldehyde dehydrogenase (AD) with 9 and 6, and AOX with 7 and 4 members respectively, in H. armigera and H. assulta (Tables 1 and 2,  Supplementary Tables S2-S5 online).
Tissue expression profile and mRNA abundance of the sex pheromone biosynthesis putative genes. We further characterized the expression levels and tissue expression pattern of the transcripts putatively involved in pheromone biosynthesis by semi-quantitative RT-PCR and qRT-PCR. Transcript abundance in the PG was also calculated as RPKM (reads per kilobase per million mapped reads). For this analysis, H. armigera sequences had the prefix Harm and H. assulta sequences had the prefix Hass followed by the gene name. The results showed that all the analysed transcripts had different expression patterns and most orthologous transcripts had similar expression profiles (Figs. 3 and 4).
We identified two ACCs from the PGs of both H. armigera and H. assulta (Tables 1 and 2). Semi-quantitative RT-PCR and qRT-PCR results revealed that HarmACC2 and HassACC2 were highly expressed in PGs compared to the female body without the PGs Figure 1 | Dissection of Helicoverpa armigera and Helicoverpa assulta sex pheromone glands. The pheromone glands in H. armigera (a) and H. assulta (b) were squeezed out from the abdomen using forceps (the gland is similarly inflated when the female calls). The abdomen of H. armigera (c) and H. assulta (e) were cut at the sclerotized cuticle from the 8 th abdominal segment, and the sclerotized cuticle was removed (H. armigera (d) and H. assulta (f)) before immersing the glands in liquid nitrogen. 1: Sclerotized ovipositor valves; 2: Pheromone gland; 3: Sclerotized cuticle that was removed. Phylogenetic analyses of the DESs. To further investigate the function of the DESs from H. armigera and H. assulta, 15 candidate DESs from these two species were phylogenetically analysed with other lepidopteran DESs (Fig. 5). In the resulting   Tables S2 and S3 online). Semi-quantitative RT-PCR results indicated that the orthologous transcripts had similar expression profiles (Fig. 6). Most of the OBPs were highly expressed in antennae and/or PGs, indicating their function in the detection and protection of plant volatiles, oviposition-deterring pheromones, and sex pheromones. Most CSPs were expressed in a range of tissues, suggesting common functions. Similar to OBPs, several CSPs in the Helicoverpa species were highly expressed in antennae and/or PGs (Fig. 6).

Discussion
Speciation in insects is often associated with changes in mate recognition systems. Particularly, sex pheromone-induced behaviours play crucial roles in insect reproduction and contribute significantly to reproductive isolation 26 . In moths, sex pheromones are synthesized in the PGs. Both H. armigera and H. assulta, which are sibling noctuid species, use the sex-pheromone components, Z9-16:Ald and Z11-16:Ald. However, the components are present in opposite ratios in the two species. Intrigued by this, we investigated differences in the transcripts related to sex pheromone biosynthesis, transport and degradation in the two sibling species by sequencing the transcriptomes from the PGs of the two Helicoverpa species. A total of 108 and 93 putative pheromone biosynthesis, transport, and degradation transcripts were respectively identified in H. armigera and H. assulta PGs. Further characterization of these transcripts by semi-quantitative RT-PCR, qRT-PCR, phylogenetic, and mRNA abundance analyses revealed that some of the transcripts had three characteristics: 1) transcripts that are distinctly or highly expressed in PGs than female body (without PG), 2) transcripts that are more abundant than the other transcripts in the PGs, and 3) transcripts that are homologous to other insect genes with demonstrated function in sex pheromone biosynthesis. Generally, the pheromone biosynthesis pathway in moths begins with the ATP-dependent carboxylation of acetyl-CoA to malonyl-CoA catalysed by ACC 27 . Compared to other ACCs, HarmACC2 and HassACC2 were highly expressed in PGs and were highly abundant than the other ACCs. In pheromone biosynthesis, FAS has been shown to use malonyl-CoA and NADPH to produce fatty acids 4 . In this study, none of the FASs displayed PG-biased expression, although HarmFAS2 and HassFAS2 were highly abundant in the PG transcriptomes with high RPKM values compared to other FASs. Future studies on the functional characterization of the Helicoverpa ACCs and FASs may reveal their specific roles in pheromone biosynthesis.
During sex pheromone biosynthesis, DESs introduces double bonds at specific positions in fatty acid chains 28,29 . Previous studies with labelled fatty acids demonstrated that different pathways are used in the pheromone biosynthesis in H. armigera and H. assulta 22 to achieve the markedly different ratios in the sex pheromone components, Z9-16:Ald and Z11-16:Ald. Among the DESs identified in our study, HarmGATD, HarmLPAQ, HassKPSE, HassGATD, and HassLPAQ displayed PG-biased expression compared with the adult female body. However, phylogenetic analyses showed that HarmGATD and HassGATD were not clustered into groups that were previously demonstrated to function in sex pheromone biosynthesis. On the other hand, HarmLPAQ and HassLPAQ were the members of D11-desaturases group, and HarmKPSE was closely related to HassKPSE in the D9-desaturases (16C . 18C) group. These two groups of desaturases share a conserved biological function in sex pheromone biosynthesis 30 . Previous studies on desaturases from H. assulta 24,31 , Helicoverpa zea 32 and Trichoplusia ni 9,33 showed that HassKPSE encodes a D9-desaturase. The action of this D9-desaturase results in the production of higher amounts of Z9-16:Acid than Z9-18:Acid 24 . HassLPAQ shown to encode a D11desaturase that specifically produced Z11-16:Acid. HzPGDs2 32 , TniKPSE 33   Interestingly, HarmKPSE did not show PG-biased expression, suggesting that this gene is not involved in the sex pheromone biosynthesis. This results is well consistent with the previous labelling study 22 with D 3 -16:Acid and D 3 -18:Acid showed that Z11-16:Ald is produced by D11 desaturation of 16:Acid in both H. armigera and H. assulta. However, Z9-16:Ald is produced by D11 desaturation of 18:Acid and one cycle of two-carbon chain shortening in H. armigera, while Z9-16:Ald is mainly produced by D9 desaturation of 16:Acid and by D11 desaturation of 18:Acid and one cycle of twocarbon chain shortening in H. assulta 22 .Therefore, unlike the HassKPSE, HarmKPSE that encodes a D9-desaturase is not likely to be involved in sex pheromone biosynthesis.
On the other hand, PG abundance is another characteristic feature of the genes involved in sex pheromone biosynthesis. The high abundance of HarmLPAQ, HassKPSE and HassLPAQ in the PG transcriptomes suggest that these high abundance and PG biased transcripts may have a role in sex pheromone biosynthesis in the two Helicoverpa species. Furthermore, the abundance of HassKPSE (D9) was 7-fold higher than HassLPAQ (D11) in the H. assulta PG transcriptome was consistent with the major pheromone component being Z9-16:Ald in H. assulta. As compared with HarmLPAQ, the lower abundance of HarmKPSE (about 239-fold) is consistent with that HarmKPSE is not likely to be involved in sex pheromone biosynthesis. Together our data along with others reported previously 24,22 suggest that among the DESs identified in our study, only HarmLPAQ (D11) is likely involved in sex pheromone biosynthesis in H. armigera, while both HassLPAQ (D11) and HassKPSE (D9) may be involved in this process in H. assulta.
Mutations that affect gene regulation could be more important in evolution than those changing the amino acid sequence of a protein 34 . In our study, HarmKPSE and HassKPSE had high amino acid identity (99.72%) indicating similar function. But, their expression patterns were different, and the mRNA abundance of HassKPSE was 39-fold higher in the H. assulta PG than HarmKPSE in H. armigera PG, while HassLPAQ was 30-fold lower than HarmLPAQ. Therefore, we presume that the regulation of DESs in these two Helicoverpa species likely resulted in the evolution of different pathways in the sex pheromone biosynthesis resulting in the final opposite ratios between two sex pheromone components. Further studies on regulation of DESs and its function are needed to determine their specific roles in the biosynthesis pathways of these two Helicoverpa species.
After the introduction of a specific double bond in the sex pheromone biosynthesis pathway, the fatty acyl CoA pheromone precursors are further reduced to the corresponding alcohols by FAR [35][36][37] and then catalysed by ALR. Among the FARs and ALRs identified in this study, HarmFAR12, HassFAR6, HarmALR2, and HassALR15 not only showed PG-biased expressions but also displayed a higher abundance than the others in the PGs suggesting their role in sex pheromone biosynthesis.
Some olfactory sensilla are distributed on the ovipositor 38,39 , which may function in the olfactory detection of plant odours, ovipositordeterring pheromones, and sex pheromones. OBPs and CSPs are thought to be responsible for the binding and transport of hydrophobic molecules, including pheromones and plant volatiles 13,15 . After sex pheromones have stimulated the olfactory receptor neurons, they must be rapidly removed by AD and/or AOX to restore the sensitivity of the sensory neuron 16 . The OBPs and CSPs that are mainly expressed in antennae and PGs could play important roles in the binding and transport of plant volatiles, oviposition-deterring pheromones, and sex pheromones. On the other hand, antennae and PGs highly expressed ADs, and AOXs, which could be involved in degrading sex pheromone and aldehyde odorants 16,40 .
In conclusion, we sequenced the PG transcriptomes in the two noctuid sibling species, H. armigera and H. assulta to identify the mechanisms regulating the opposite ratios of the sex pheromone components, Z9-16:Ald and Z11-16:Ald in the two species. Our analyses based on phylogeny, transcript expression, and transcript abundance indicates that a number of transcripts with PG-biased expression could be involved in the sex pheromone biosynthesis in the two species. Particularly, DESs seem to play a prominent role in the regulation of the component ratio in H. armigera and H. assulta. Additional functional analyses are needed to verify this conjecture in future.

Methods
Insect samples. Helicoverpa armigera were collected from cotton fields in the Institute of Cotton Research at the Chinese Academy of Agricultural Sciences. Helicoverpa assulta were provided by the Henan University of Science and Technology in China. Both species were reared in the laboratory on an artificial diet 41 at 25 6 1uC, 14510 L:D light cycle, and 65 6 5% relative humidity. Pupae were sexed and kept separately in cages for eclosion. The pupae were checked daily for emergence and supplied with 10% honey solution as food for the emerging adults.
Tissue collection. To construct cDNA libraries, 15 PGs from 3-day-old virgin females from each of the two species were collected at 5 h in scotophase (Fig. 1), immediately frozen in liquid nitrogen, and stored at 280uC until further use. In addition, for semi-quantitative RT-PCR and qRT-PCR, female antennae (FA), male antennae (MA), pheromone glands (PGs), whole insect body without pheromone glands (B1), and whole insect body without pheromone glands and antennae (B2) were also collected from three-day-old virgin insects. These tissues were immediately frozen and stored at 280uC until RNA isolation.
cDNA library construction and Illumina sequencing. Total RNA was extracted from the PGs of H. armigera and H. assulta using TRIzol reagent (Invitrogen, Carlsbad, CA, USA), and cDNA library construction and Illumina sequencing of the samples were performed at the Beijing Genomics Institute, Shenzhen, China 42 . For each species, poly-adenylated RNA were isolated from 20 mg of pooled total RNA using oligo (dT) magnetic beads. Then, the mRNA from each species were fragmented into short pieces in the presence of divalent cations in fragmentation buffer at 94uC for 5 min. Using the cleaved fragments as templates, random hexamer primers were used to synthesize first-strand cDNA using the. Second-strand cDNA was generated using the buffer, dNTPs, RNAseH, and DNA polymerase I. Following end repair and adaptor ligation, short sequences were amplified by PCR and purified with a QIAquickH PCR extraction kit (Qiagen, Venlo, The Netherlands), and sequenced on a HisSeq TM 2000 platform (Illumina, San Diego, CA, USA). The clean reads obtained in this study are available at the NCBI/SRA database under accession numbers SRR1565435 and SRR1570898.
Assembly and annotation. The PG transcriptomes of H. armigera and H. assulta were assembled de novo using the short-read assembly program Trinity 43 , which generated two classes of transcripts: clusters (prefix CL) and singletons (prefix U). Transcripts larger than 150 bp were compared to existing sequences in the protein databases, including NCBI Nr, Swiss-Prot, KEGG 44 , and COG, using blastX. We then used the Blast2GO program 45 for GO annotation of the transcripts and WEGO software 46 to plot the GO annotation results.
Analysis of transcript expression in the pheromone glands. Transcript expression abundances were calculated by the RPKM (reads per kilobase per million mapped reads) method 47 , which can eliminate the influence of different transcript lengths and sequencing discrepancies in calculating expression abundance 47 . RPKM was calculated from the equation (1): where RPKM (A) is the expression of transcript A; C is the number of reads uniquely aligned to transcript A; N is the total number of fragments uniquely aligned to all transcripts; and L is the number of bases in transcript A.
Phylogenetic analysis. To investigate the phylogenetic relationships between the two Helicoverpa species, we compared all putative transcripts involved in the pheromone biosynthesis, reception, and degradation in each of the two species using ClustalX2.0 with default settings 48 . Since desaturases are the most extensively studied class of enzymes involved in sex pheromone biosynthesis, we imported 67 lepidopteran desaturases 28 sequences from NCBI Nr and those from H. armigera and H. assulta. All phylogenetic trees were constructed using the neighbour-joining method implemented in MEGA6 with default settings and 1000 bootstrap replicates.
Semi-quantitative RT-PCR analysis. Total RNA was isolated using the SV Total Isolation System (Promega, Madison, WI, USA) according to the manufacturer's instructions. Single-stranded cDNA templates were synthesized using 1 mg of total RNA from various samples (FA, MA, PGs, B1 and B2) using the Reverse Transcription System (Promega) following the instructions in the manual.
Specific primers for the transcripts putatively involved in pheromone biosynthesis, reception, and degradation were designed using Beacon Designer 7.7 (Premier Biosoft, Palo Alto, CA, USA) (Supplementary Table S6 online). Semi-quantitative PCR experiments with negative controls (no cDNA template) were performed as follows: 94uC for 2 min; followed by 28 cycles at 94uC for 30 sec, 60uC for 30 sec, and 72uC for 30 sec. The reactions were performed in 20 mL PCR reactions containing 2.0 mL of 103 Ex-Taq PCR buffer, 1.6 mL of dNTPs (10 mM), 0.8 mL of forward primer (10 mM), 0.8 mL of reverse primer (10 mM), 15 ng of single-stranded cDNA, and 0.2 mL Ex-Taq (5 U/mL). PCR products were analysed by electrophoresis on 2.0% w/v agarose gel in TAE buffer and the resulting bands were visualized with GluRed according to the manufacturer's instructions. The GTP-binding protein (GenBank No. AY957405) from H. armigera was used as an endogenous control. Each reaction had three independent biological replicates.
Quantitative real time PCR and data analysis. Total RNA and cDNA synthesis were performed as described for semi-quantitative RT-PCR analysis. qRT-PCR was performed in a MastercyclerH ep realplex (Eppendorf, Hamburg, Germany) with primers designed based on the Helicoverpa nucleotide sequences from the Illumina data using Beacon Designer 7.7 (Supplementary Table S7 online). The H. armigera GTP-binding protein (AY957405) and GAPDH (JF417983) were used as reference genes. Expression levels of the tested mRNA were determined using the GoTaqH qPCR Master Mix (Promega) according to the manufacturer's instructions. A blank control without template cDNA (replacing cDNA with H 2 O) served as the negative control. Each reaction had three independent biological replicates and was repeated three times (technical replicates). Relative expression levels were calculated using the comparative 2 2ggCT method 49 .
Statistical Analysis of data. Data (mean 6 SE) from various samples were subjected to one-way nested analysis of variance (ANOVA) followed by a least significant difference test (LSD) for mean comparison. Two-sample analysis was performed by Student's t-test using SPSS Statistics 17.0 (IBM, Chicago, IL, USA).